Open
Labels
content_check_passed, enhancement, good first issue, help wanted
Description
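Problem / Pain Point
asibench needs to support lower request rates. The current minimum is 0.1; we would like the minimum lowered to 0.001.
In a distributed edge-cloud collaboration scenario with the DS-R1-671B model, a single 128K prefill takes nearly 50 s. We want a new request to arrive only after the previous prefill has finished, so a lower request rate is needed. We suggest supporting rates down to 0.001.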
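The arithmetic behind this request can be sketched as follows. This is only an illustration: it assumes the request rate is expressed in requests per second and that requests are sent at a fixed interval of 1/rate seconds, neither of which is stated in this issue. The function names are hypothetical and not part of the tool.

```python
def request_interval_s(request_rate: float) -> float:
    """Seconds between requests, assuming a fixed interval of 1/rate."""
    if request_rate <= 0:
        raise ValueError("request_rate must be positive")
    return 1.0 / request_rate


def max_rate_for_gap(min_gap_s: float) -> float:
    """Highest rate whose interval is at least min_gap_s seconds."""
    if min_gap_s <= 0:
        raise ValueError("min_gap_s must be positive")
    return 1.0 / min_gap_s


PREFILL_S = 50.0  # approximate 128K prefill time reported in this issue

for rate in (0.1, 0.001):
    interval = request_interval_s(rate)
    verdict = "waits for" if interval >= PREFILL_S else "overlaps"
    print(f"rate={rate}: one request every {interval:.0f} s, {verdict} the previous prefill")
```

Under these assumptions, the current minimum of 0.1 sends a request every 10 s, well inside a 50 s prefill, while any rate at or below roughly 0.02 leaves enough room for the previous prefill to finish.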
Proposed Solution
Lower the minimum supported request rate to 0.001, and emit a warning whenever the request rate is below 0.1 so that users notice it.
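A minimal sketch of what such a check might look like is shown below. The constant names and `validate_request_rate` are hypothetical; the tool's actual validation code is not shown in this issue, so this only illustrates the proposed behavior using Python's standard `warnings` module.

```python
import warnings

# Hypothetical thresholds taken from this proposal.
MIN_REQUEST_RATE = 0.001  # proposed new lower bound
WARN_BELOW_RATE = 0.1     # current lower bound; rates below it trigger a warning


def validate_request_rate(rate: float) -> float:
    """Reject rates below the minimum and warn on unusually low rates."""
    if rate < MIN_REQUEST_RATE:
        raise ValueError(
            f"request rate {rate} is below the minimum of {MIN_REQUEST_RATE}"
        )
    if rate < WARN_BELOW_RATE:
        warnings.warn(
            f"request rate {rate} is below {WARN_BELOW_RATE}; "
            "the benchmark may run for a long time",
            UserWarning,
            stacklevel=2,
        )
    return rate
```

With this shape, a rate of 0.01 is accepted with a warning, 0.5 is accepted silently, and 0.0001 is rejected. Keeping the warning threshold at the old minimum preserves existing behavior for every configuration that was already valid.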
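Alternatives
None
Expected Value
A wider, more flexible range of supported request rates, enabling a richer set of performance-testing scenarios.
Willingness to Participate
- I am willing to participate in the development or testing of this feature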