
ChatGLM3 output token size is not generated as expected #390

Closed
JunxiChhen opened this issue Apr 26, 2024 · 6 comments

@JunxiChhen (Contributor)

Benchmark command (with -ic 128):

numactl -C 0-55 -m 0 python benchmark.py -m /root/.cache/huggingface/hub/chatglm3-6b-ov/pytorch/dldt/FP16 -p "It is done, and submitted..." -n 2 -bs 1 -d CPU --torch_compile_backend openvino -ic 128 --num_beams 1 -lc bfloat16_config.json 2>&1 | tee -a ./logs/0.log

[screenshot: benchmark log showing far fewer than 128 output tokens]

BTW, ChatGLM2's output size is right.

@peterchen-intel (Collaborator)

@JunxiChhen The reason is that ChatGLM3 emits its end token at output size 17 with this input prompt. The workaround is to update the prompt so that ChatGLM3 generates more tokens (>=128).
-ic sets the maximum output token size (i.e., the maximum number of inference iterations).
There is an ongoing PR #289 that tries to "force" generation of the expected output size even when the end token is produced.
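
For context, a minimal sketch of the "force the output size" idea, written against the Hugging Face transformers generate() API rather than the actual llm_bench change in PR #289; the model id and prompt here are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id; PR #289 targets the llm_bench harness, not this script.
MODEL_ID = "THUDM/chatglm3-6b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

inputs = tokenizer("It is done, and submitted...", return_tensors="pt")

# min_new_tokens suppresses the end token until 128 new tokens have been
# generated, and max_new_tokens caps generation there, so exactly 128 tokens
# come out even if the model would otherwise stop early (here at ~17 tokens).
outputs = model.generate(**inputs, min_new_tokens=128, max_new_tokens=128)
print(outputs.shape[1] - inputs["input_ids"].shape[1])  # -> 128
```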

@peterchen-intel (Collaborator)

@JunxiChhen PR #289 has been merged. Please verify.

@yangkunx

When I updated to the latest commit id and ran the case, I got the error below:
[screenshot: error output]

@peterchen-intel (Collaborator)

@yangkunx This new issue should be fixed by PR #435.

@peterchen-intel (Collaborator)

PR #289 was just reverted due to its performance impact. A new PR #457 is WIP.

@peterchen-intel (Collaborator)

The new PR #457 has been merged. @yangkunx @JunxiChhen please verify.
