[benchmark] update benchmark tools by ZhangYulongg · Pull Request #6991 · PaddlePaddle/FastDeploy

ZhangYulongg · 2026-03-24T09:54:36Z

Motivation

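The benchmark tool now supports multi-turn conversations in token-in/token-out mode.
Enable it with --multi-turn.
--tokenizer-model: specify when sending requests with prompt_token_ids; the tokenizer model type for multi-turn conversations. Options: "eb": ErnieBotTokenizer, "eb5": Ernie5Tokenizer, "eb_mm": Ernie4_5Tokenizer
--tokenizer-path: specify when sending requests with prompt_token_ids; the path to the model tokenizer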
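
For orientation, the flags above could be declared roughly as follows. This is a minimal sketch and not the PR's actual parser: the `build_parser` helper, the defaults, and the help strings are assumptions; only the flag names and the three tokenizer types come from the description.

```python
import argparse

# Tokenizer types named in the PR description (type key -> class name).
TOKENIZER_TYPES = {
    "eb": "ErnieBotTokenizer",
    "eb5": "Ernie5Tokenizer",
    "eb_mm": "Ernie4_5Tokenizer",
}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the three flags discussed here."""
    parser = argparse.ArgumentParser(description="multi-turn token-in/token-out sketch")
    parser.add_argument("--multi-turn", action="store_true", help="enable multi-turn mode")
    parser.add_argument(
        "--tokenizer-model",
        type=str,
        default=None,  # assumed default
        help="tokenizer model type, e.g. " + ", ".join(sorted(TOKENIZER_TYPES)),
    )
    parser.add_argument(
        "--tokenizer-path",
        type=str,
        default=None,  # assumed default
        help="path to the model tokenizer",
    )
    return parser


# Placeholder path; replace with a real tokenizer directory.
args = build_parser().parse_args(
    ["--multi-turn", "--tokenizer-model", "eb5", "--tokenizer-path", "/path/to/tokenizer"]
)
print(args.multi_turn, args.tokenizer_model, args.tokenizer_path)
```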

Modifications

Usage or Command

Accuracy Tests

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-03-24T09:54:52Z

Thanks for your contribution!

EmmonsCurse · 2026-03-24T09:58:01Z

/skip-ci ci_iluvatar
/skip-ci ci_hpu
/skip-ci build_gpu
/skip-ci build_xpu

Copilot

Pull request overview

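This PR adds a multi-turn "token-in / token-out" request mode to the benchmarks load-testing tool: in multi-turn scenarios, requests can be sent with prompt_token_ids, and completion_token_ids are collected from the streamed response so that tokens can be concatenated for subsequent rounds.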

Changes:

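Adds a tokenizer dependency list and a local tokenizer implementation (used for token_ids concatenation and chat templating).
benchmark_serving gains the --tokenizer-model/--tokenizer-path arguments and passes them through to the request side.
backend_request_func supports prompt_token_ids requests, collects completion_token_ids, and adds token_ids concatenation logic for multi-turn.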
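
The token_ids concatenation described above can be pictured with a small sketch. It is an illustration under assumptions, not the code in backend_request_func: `toy_encode` stands in for a real tokenizer, `splice_next_round` is a hypothetical helper, and the exact pieces the PR appends per round are not shown on this page.

```python
from typing import List


def toy_encode(text: str) -> List[int]:
    """Stand-in for a real tokenizer: one id per character (illustration only)."""
    return [ord(ch) for ch in text]


def splice_next_round(
    prev_prompt_ids: List[int],
    completion_token_ids: List[int],
    new_turn_ids: List[int],
) -> List[int]:
    """Build the next round's prompt ids without re-tokenizing earlier turns."""
    return [*prev_prompt_ids, *completion_token_ids, *new_turn_ids]


# Round 1: the templated history is encoded once and sent as prompt_token_ids.
round1_ids = toy_encode("U:hi")
# The streamed response yields completion_token_ids for round 1.
completion_ids = toy_encode("A:yo")
# Round 2: only the new turn is encoded; earlier ids are reused unchanged.
round2_ids = splice_next_round(round1_ids, completion_ids, toy_encode("U:ok"))
print(len(round1_ids), len(round2_ids))
```

Reusing the ids returned by the server keeps each round's prompt an exact prefix extension of the previous one, which decoding and re-encoding the text would not guarantee.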

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
benchmarks/requirements_tokenizer.txt	Adds the dependency list required for the tokenizer / multi-turn token_ids mode
benchmarks/ernie_tokenizer.py	Adds the Ernie-family tokenizer implementation, loaded on the benchmark side
benchmarks/benchmark_serving.py	Adds tokenizer argument pass-through; adjusts the initial test run logic
benchmarks/backend_request_func.py	Adds prompt_token_ids/return_token_ids support and multi-turn token concatenation
benchmarks/README.md	Documents the token_ids mode arguments for multi-turn

Copilot · 2026-03-24T10:04:53Z

@@ -504,10 +584,50 @@ async def async_request_eb_openai_chat_completions_multi_turn(
    ) as session:
        for i, message in enumerate(ori_history):
            if message["role"] == "user" or message["role"] == "tool":
+                if i == 15:
+                    break
                history.append(message)
                round_input = copy.deepcopy(request_func_input)
                round_input.history_QA = history
                round_input.no = f"{round_input.no}_{prompt_no}"
+                if use_token_ids:
+                    if len(input_ids_all) == 0:
+                        # token_ids concatenation mode: token_ids for the first round
+                        spliced_text = tokenizer.apply_chat_template(
+                            history,
+                            tokenize=False,
+                            split_special_tokens=False,
+                            add_special_tokens=False,
+                        )
+                        # convert to token ids


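In token_ids concatenation mode, tokenizer = load_tokenizer(...) may return None (the current implementation catches all exceptions and only logs a warning), but tokenizer.apply_chat_template(...) is called immediately afterwards, which raises AttributeError. Suggestions: 1) when load_tokenizer fails, raise an exception that includes the path and model type, or at least check explicitly before returning; 2) before entering token_ids mode, verify that tokenizer_path exists and that tokenizer is not None.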
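
One way to apply both suggestions is a wrapper that fails fast. This is a sketch under assumptions: `TokenizerLoadError` and `load_tokenizer_strict` are hypothetical names, and `loader` is an injected callable standing in for whatever actually constructs the tokenizer.

```python
from typing import Any, Callable, Optional


class TokenizerLoadError(RuntimeError):
    """Raised when token_ids mode cannot obtain a usable tokenizer (hypothetical)."""


def load_tokenizer_strict(
    loader: Callable[[str], Any],
    tokenizer_path: Optional[str],
    tokenizer_model: Optional[str],
) -> Any:
    """Return a tokenizer or raise with the path and model type in the message."""
    if not tokenizer_path:
        raise TokenizerLoadError("--tokenizer-path is required in token_ids mode")
    try:
        tokenizer = loader(tokenizer_path)
    except Exception as exc:
        # Keep the original exception as the cause instead of swallowing it.
        raise TokenizerLoadError(
            f"failed to load tokenizer (model={tokenizer_model!r}, path={tokenizer_path!r})"
        ) from exc
    if tokenizer is None:
        raise TokenizerLoadError(
            f"loader returned None (model={tokenizer_model!r}, path={tokenizer_path!r})"
        )
    return tokenizer


def broken_loader(path: str) -> Any:
    """Simulates a loader that hits an AttributeError while loading."""
    raise AttributeError("resource_files_names")


try:
    load_tokenizer_strict(broken_loader, "/path/to/tokenizer", "eb5")
except TokenizerLoadError as err:
    message = str(err)
    print(message)
```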

Copilot · 2026-03-24T10:04:53Z

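+Multi-turn requests in prompt_token_ids mode
+```bash
+Enable --multi-turn
+--tokenizer-model: specify when sending requests with prompt_token_ids; the tokenizer model type for multi-turn conversations. Options: "eb": ErnieBotTokenizer, "eb5": Ernie5Tokenizer, "eb_mm": Ernie4_5Tokenizer
+--tokenizer-path: specify when sending requests with prompt_token_ids; the path to the model tokenizer
+```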


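The PR title tag does not appear to follow the repository's [CLASS]Title convention: it is currently [benchmark] ... (lowercase and not in the checklist's tag list). Consider changing it to [Benchmark] Update benchmark tools, or another standard tag that matches the PR's semantics. Also, the Modifications / Usage sections of the PR description are still placeholders; please add the specific changes and how to run and verify them.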

Copilot · 2026-03-24T10:04:53Z

+    prompt_token_ids: Optional[list] = None
+    tokenizer_model: str = None
+    tokenizer_path: str = None


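In RequestFuncInput, tokenizer_model/tokenizer_path are declared as str but default to None; prompt_token_ids is also closer to Optional[list[int]]. Consider changing the annotations to Optional[str] / Optional[list[int]] so that type checkers and IDE inference are not misled.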

Suggested change

-    prompt_token_ids: Optional[list] = None
-    tokenizer_model: str = None
-    tokenizer_path: str = None
+    prompt_token_ids: Optional[list[int]] = None
+    tokenizer_model: Optional[str] = None
+    tokenizer_path: Optional[str] = None
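
As a self-contained illustration of the suggested annotations, assuming a dataclass-style container: `RequestFuncInputSketch`, its `prompt` field, and `uses_token_ids` are hypothetical and do not mirror the real RequestFuncInput.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestFuncInputSketch:
    """Hypothetical subset of the request input; only the three new fields matter here."""

    prompt: str
    prompt_token_ids: Optional[List[int]] = None
    tokenizer_model: Optional[str] = None
    tokenizer_path: Optional[str] = None

    def uses_token_ids(self) -> bool:
        """True when the request should be sent with prompt_token_ids."""
        return self.prompt_token_ids is not None


text_request = RequestFuncInputSketch(prompt="hi")
ids_request = RequestFuncInputSketch(prompt="hi", prompt_token_ids=[1, 2, 3], tokenizer_model="eb5")
print(text_request.uses_token_ids(), ids_request.uses_token_ids())
```

With Optional in the annotation, a type checker accepts the None default and flags code that uses the field without a None check.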

Copilot · 2026-03-24T10:04:54Z

+        return encoded_inputs
+
+
+hack_uft16_ascii = True


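The variable name hack_uft16_ascii appears to contain a typo (uft should be utf), and later code uses it as a switch. Consider renaming it to hack_utf16_ascii (keeping the old name as an alias if compatibility is needed) to avoid confusion.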

Suggested change

-hack_uft16_ascii = True
+# NOTE: `hack_uft16_ascii` is kept as backward-compatible alias; please use `hack_utf16_ascii` instead.
+hack_utf16_ascii = True
+hack_uft16_ascii = hack_utf16_ascii

Copilot · 2026-03-24T10:04:54Z

+        self.use_oov_uft_16_be = True  # True # whether OOV tokens use uft_16_be encoding
+        logger.info(f">>> UTF_16_BE: self.use_oov_uft_16_be:{self.use_oov_uft_16_be}")
+
+    def set_oov_utf_16_be(self, use_oov_uft_16_be=True):
+        """
+        use_oov_uft_16_be switch
+        """
+        self.use_oov_uft_16_be = use_oov_uft_16_be
+        print(f"use_oov_uft_16_be:{self.use_oov_uft_16_be}")


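The attribute name use_oov_uft_16_be appears to contain a typo (uft should be utf). Consider renaming it consistently to use_oov_utf_16_be to avoid confusion in later calls and searches.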

Suggested change

-        self.use_oov_uft_16_be = True  # True # whether OOV tokens use uft_16_be encoding
-        logger.info(f">>> UTF_16_BE: self.use_oov_uft_16_be:{self.use_oov_uft_16_be}")
-
-    def set_oov_utf_16_be(self, use_oov_uft_16_be=True):
-        """
-        use_oov_uft_16_be switch
-        """
-        self.use_oov_uft_16_be = use_oov_uft_16_be
-        print(f"use_oov_uft_16_be:{self.use_oov_uft_16_be}")
+        self.use_oov_utf_16_be = True  # True # whether to use UTF_16_BE encoding for OOV
+        logger.info(f">>> UTF_16_BE: self.use_oov_utf_16_be:{self.use_oov_utf_16_be}")
+
+    def set_oov_utf_16_be(self, use_oov_utf_16_be: bool = True):
+        """
+        Toggle use_oov_utf_16_be switch.
+        """
+        self.use_oov_utf_16_be = use_oov_utf_16_be
+        print(f"use_oov_utf_16_be:{self.use_oov_utf_16_be}")

Copilot · 2026-03-24T10:04:54Z

+    if not debug:
+        print("test_input:", test_input)

-    test_output = await request_func(request_func_input=test_input)
+        test_output = await request_func(request_func_input=test_input)

-    if args.multi_turn:
-        out_list, metrics = test_output
-        test_output = out_list[0]
+        if args.multi_turn:
+            out_list, metrics = test_output
+            test_output = out_list[0]

-    if not test_output.success:
-        print("test_output:", test_output, flush=True)
-        raise ValueError(
-            f"Initial test run failed - Please make sure that 1. benchmark arguments are correctly specified and 2. the http_proxy and https_proxy are turned off. Error: {test_output.error}"
-        )
-    else:
-        print("Initial test run completed. Starting main benchmark run...")
+        if not test_output.success:
+            print("test_output:", test_output, flush=True)
+            raise ValueError(
+                f"Initial test run failed - Please make sure that 1. benchmark arguments are correctly specified and 2. the http_proxy and https_proxy are turned off. Error: {test_output.error}"
+            )
+        else:
+            print("Initial test run completed. Starting main benchmark run...")


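The benchmark's initial single-prompt test run is now inside the if not debug: branch: with --debug enabled, the initial connectivity/argument check is not executed and test_input is not printed, so the main benchmark starts directly and errors surface later. Consider always executing the initial test run (or adding an explicit --skip-initial-test switch), and decoupling the debug printing from whether the test is executed.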
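
A minimal sketch of that decoupling, under stated assumptions: `run_initial_test`, `FakeOutput`, `fake_request`, and the `skip_initial_test` parameter are hypothetical (the last mirrors the --skip-initial-test switch proposed above, which does not exist in the PR), and printing test_input only under debug is one possible choice.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeOutput:
    """Stand-in for the request output object; only the fields used here."""

    success: bool
    error: str = ""


async def run_initial_test(request_func, test_input, *, debug=False,
                           skip_initial_test=False, multi_turn=False):
    """Run the connectivity check unless explicitly skipped; debug only adds output."""
    if debug:
        print("test_input:", test_input)
    if skip_initial_test:
        return None
    test_output = await request_func(request_func_input=test_input)
    if multi_turn:
        # Multi-turn request functions return (outputs, metrics).
        out_list, _metrics = test_output
        test_output = out_list[0]
    if not test_output.success:
        raise ValueError(f"Initial test run failed. Error: {test_output.error}")
    print("Initial test run completed. Starting main benchmark run...")
    return test_output


async def fake_request(request_func_input):
    """Pretends the server answered successfully."""
    return FakeOutput(success=True)


result = asyncio.run(run_initial_test(fake_request, {"prompt": "hi"}, debug=True))
print(result.success)
```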

Copilot · 2026-03-24T10:04:54Z

+                    Ernie5Tokenizer.resource_files_names["vocab_file"] = vocab_file_names[i]
+                    break
+            tokenizer = Ernie5Tokenizer.from_pretrained(actor_tokenizer_path)
+        else:
+            print("tokenizer: AUTO")
+            tokenizer = AutoTokenizer.from_pretrained(actor_tokenizer_path, padding_side="left", use_fast=True)
+    except Exception as e:
+        tokenizer = None
+        logging.warning(f"Load tokenizer error: {e}")


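load_tokenizer() uses the resource_files_names attribute on Ernie5Tokenizer (e.g. Ernie5Tokenizer.resource_files_names[...]), but the Ernie5Tokenizer class added in this PR actually defines vocab_files_names (benchmarks/ernie_tokenizer.py:927). This makes the eb5 branch raise AttributeError, which is swallowed by the except and returns tokenizer=None, so token_ids mode later crashes at tokenizer.apply_chat_template. Consider unifying the attribute name (rename it to resource_files_names on the Ernie5Tokenizer side, or use vocab_files_names here), and do not silently swallow this error.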

Suggested change

-                    Ernie5Tokenizer.resource_files_names["vocab_file"] = vocab_file_names[i]
-                    break
-            tokenizer = Ernie5Tokenizer.from_pretrained(actor_tokenizer_path)
-        else:
-            print("tokenizer: AUTO")
-            tokenizer = AutoTokenizer.from_pretrained(actor_tokenizer_path, padding_side="left", use_fast=True)
-    except Exception as e:
-        tokenizer = None
-        logging.warning(f"Load tokenizer error: {e}")
+                    Ernie5Tokenizer.vocab_files_names["vocab_file"] = vocab_file_names[i]
+                    break
+            tokenizer = Ernie5Tokenizer.from_pretrained(actor_tokenizer_path)
+        else:
+            print("tokenizer: AUTO")
+            tokenizer = AutoTokenizer.from_pretrained(actor_tokenizer_path, padding_side="left", use_fast=True)
+    except Exception as e:
+        logging.error("Load tokenizer error", exc_info=True)
+        raise

* [benchmark] update benchmark tools * [benchmark] update benchmark tools

[benchmark] update benchmark tools

4d70323

ZhangYulongg temporarily deployed to Metax_ci March 24, 2026 09:54 — with GitHub Actions Inactive

paddle-bot Bot added the contributor External developers label Mar 24, 2026

juncaipeng requested a review from Copilot March 24, 2026 09:59

Copilot started reviewing on behalf of juncaipeng March 24, 2026 09:59

EmmonsCurse previously approved these changes Mar 24, 2026

Copilot AI reviewed Mar 24, 2026

[benchmark] update benchmark tools

ea3717b

ZhangYulongg dismissed EmmonsCurse’s stale review via ea3717b March 24, 2026 12:53

ZhangYulongg temporarily deployed to Metax_ci March 24, 2026 12:53 — with GitHub Actions Inactive

EmmonsCurse approved these changes Mar 24, 2026

ZhangYulongg merged commit 6f5aa88 into PaddlePaddle:develop Mar 24, 2026
10 of 14 checks passed

mattheliu pushed a commit to mattheliu/FastDeploy that referenced this pull request Apr 1, 2026

[benchmark] update benchmark tools (PaddlePaddle#6991)

62ee380

* [benchmark] update benchmark tools * [benchmark] update benchmark tools

xiaoguoguo626807 pushed a commit to xiaoguoguo626807/FastDeploy that referenced this pull request May 7, 2026

[benchmark] update benchmark tools (PaddlePaddle#6991)

d0742c7

* [benchmark] update benchmark tools * [benchmark] update benchmark tools

ZhangYulongg merged 2 commits into PaddlePaddle:develop from ZhangYulongg:update_benchmark_0323

3 participants
