Can the async mode in GRPO support tensor_parallel_size > 1? #3712

@kangyishuai

Description

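For example:
Visible GPUs = 8
Training GPUs = 6
Inference GPUs = 2
tensor_parallel_size = 2
With this split, a model with a large parameter count, or one that handles long token sequences, would have enough GPU memory for inference.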
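
To make the requested layout concrete, here is a minimal sketch of the device split described above. The helper `split_devices` and its parameters are hypothetical names used only for illustration; this is not an existing API of the library, just the arithmetic behind the request, assuming the inference GPUs are taken from the end of the visible device list.

```python
def split_devices(visible, num_infer, tensor_parallel_size):
    """Hypothetical helper: split visible GPU ids into a training group
    and inference engines of `tensor_parallel_size` GPUs each."""
    if not 1 <= num_infer < len(visible):
        raise ValueError("need at least one inference GPU and one training GPU")
    if num_infer % tensor_parallel_size != 0:
        raise ValueError("inference GPU count must be a multiple of tensor_parallel_size")

    # Assumption: the last `num_infer` devices are reserved for inference.
    train = visible[:-num_infer]
    infer = visible[-num_infer:]

    # Each inference engine shards one model copy across a tensor-parallel group.
    engines = [
        infer[i:i + tensor_parallel_size]
        for i in range(0, num_infer, tensor_parallel_size)
    ]
    return train, engines


train, engines = split_devices(list(range(8)), num_infer=2, tensor_parallel_size=2)
print("training GPUs:", train)
print("inference engines:", engines)
```

With 8 visible GPUs, 2 inference GPUs, and tensor_parallel_size = 2, this yields six training devices and a single inference engine whose weights are sharded across two GPUs.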

Labels: enhancement (New feature or request)