It is recommended to use a dedicated device for vLLM #3719

@BartekKruczek

Hello,

I'm following your GRPO multi-GPU approach in a SLURM environment. I understand why we would want to use separate GPU card(s) for the vLLM deployment, but I'm getting an error related to NPROC_PER_NODE. I was told that NPROC_PER_NODE should equal the number of GPUs, which seems to contradict the idea of reserving separate cards for vLLM. I can use at most 4 GPUs per node.

AssertionError: Colocate mode requires device_count(4) == num_infer_workers(4). Please check if your device count matches NPROC_PER_NODE setting.

Any ideas why?
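To make the GPU arithmetic concrete, here is a rough sketch of the split I would expect for the dedicated-device setup. It assumes (this is not confirmed anywhere in this thread) that the training processes and the vLLM workers together should add up to the visible GPU count, so that with 4 GPUs and 1 vLLM worker NPROC_PER_NODE would be 3. The helper name `split_gpus` is hypothetical and is not part of the library.

```python
def split_gpus(device_count: int, num_infer_workers: int) -> dict:
    """Hypothetical helper: split the visible GPUs between training and vLLM.

    ASSUMPTION (not confirmed in this thread): with dedicated vLLM devices,
    NPROC_PER_NODE + num_infer_workers should equal the visible GPU count.
    """
    if not 0 < num_infer_workers < device_count:
        raise ValueError(
            "num_infer_workers must leave at least one GPU for training"
        )
    return {
        # Number of training processes to launch on this node.
        "NPROC_PER_NODE": device_count - num_infer_workers,
        # Number of GPUs reserved for vLLM inference.
        "num_infer_workers": num_infer_workers,
    }


if __name__ == "__main__":
    # 4 GPUs on the node, 1 reserved for vLLM.
    print(split_gpus(device_count=4, num_infer_workers=1))
```

If that assumption holds, the error above would suggest that my job was treated as colocate mode, where every GPU is expected to run both training and inference.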
