[MODEL] Make constant vocab size for models instead dynamic #2290

krishnaraj36 · 2024-05-07T07:17:00Z

Use the vocab size from the model config during model compilation instead of a dynamic variable, since the dynamic variable degrades the performance of the chat application.

krishnaraj36 · 2024-05-07T07:17:37Z

@srkreddy1238 @tqchen: Please take a look at this PR and let me know your advice.

tqchen · 2024-05-07T11:53:00Z

Thanks @krishnaraj36, this is a trade-off we need to make, since a constant vocab size would mean the same lib cannot be shared across fine-tunes. I think we should make it an option that the compiler can take, but not the default.
Do you know the perf impact? I know that on NVIDIA GPUs the impact is minimal.
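The sharing trade-off can be illustrated with a toy shape check. This is a minimal sketch, not MLC LLM or TVM code: `CompiledLib`, `accepts`, and the sizes used are hypothetical names and numbers chosen for illustration, and a real compiled library encodes shapes quite differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompiledLib:
    """Hypothetical stand-in for a compiled model library."""

    hidden_size: int
    # None means the vocab dimension was left dynamic at compile time.
    vocab_size: Optional[int] = None

    def accepts(self, hidden_size: int, vocab_size: int) -> bool:
        """Return True if weights with these shapes could run on this lib."""
        if hidden_size != self.hidden_size:
            return False
        # A dynamic vocab dimension matches any size; a constant one only itself.
        return self.vocab_size is None or self.vocab_size == vocab_size


# Illustrative numbers only: a base model and a fine-tune that added tokens.
base_vocab, finetune_vocab = 32000, 32016

dynamic_lib = CompiledLib(hidden_size=4096)
static_lib = CompiledLib(hidden_size=4096, vocab_size=base_vocab)

for name, lib in [("dynamic", dynamic_lib), ("static", static_lib)]:
    print(name, lib.accepts(4096, base_vocab), lib.accepts(4096, finetune_vocab))
```

Under these assumptions the dynamic lib serves both checkpoints, while the static lib would have to be recompiled for the fine-tune with the extended vocabulary.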

krishnaraj36 · 2024-05-07T12:00:06Z

> Thanks @krishnaraj36, this is a trade-off we need to make, since a constant vocab size would mean the same lib cannot be shared across fine-tunes. I think we should make it an option that the compiler can take, but not the default. Do you know the perf impact? I know that on NVIDIA GPUs the impact is minimal.

There is a 15-20% performance impact for a few LLM models such as Gemma-2b and Mistral-7b.

tqchen · 2024-05-07T12:25:04Z

Got it, I assume this is on Adreno? In that case, I think it would be great if we could add a flag to the gen_config command, --static-vocab-size, which can be turned on optionally and writes "static_vocab_size" as a flag into mlc-chat-config.json.

That will then affect the compiled binary. The default case can still use a dynamic vocab size.
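A minimal sketch of how such an opt-in flag could flow from the command line into the config and then into the compile-time shape choice. Everything here is an assumption for illustration: `build_parser`, `make_chat_config`, and `vocab_dim_for_compile` are hypothetical helpers, not MLC LLM APIs, and the string placeholder merely stands in for a symbolic shape variable. The thread only proposes the flag; it does not establish that it exists.

```python
import argparse
import json
from typing import Union


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the proposed gen_config option."""
    parser = argparse.ArgumentParser(prog="gen_config_sketch")
    parser.add_argument("--vocab-size", type=int, required=True)
    parser.add_argument(
        "--static-vocab-size",
        action="store_true",  # off by default, so dynamic stays the default
        help="Bake the vocab size into the compiled library.",
    )
    return parser


def make_chat_config(args: argparse.Namespace) -> dict:
    """Record the choice so a later compile step can read it."""
    return {
        "vocab_size": args.vocab_size,
        "static_vocab_size": args.static_vocab_size,
    }


def vocab_dim_for_compile(chat_config: dict) -> Union[int, str]:
    """Pick a constant dimension, or a placeholder for a symbolic one."""
    if chat_config.get("static_vocab_size", False):
        return chat_config["vocab_size"]
    return "vocab_size"  # placeholder for a dynamic (symbolic) dimension


default_cfg = make_chat_config(build_parser().parse_args(["--vocab-size", "32000"]))
static_cfg = make_chat_config(
    build_parser().parse_args(["--vocab-size", "32000", "--static-vocab-size"])
)

# What the relevant part of the written config file might look like.
print(json.dumps(static_cfg, indent=2))
print(vocab_dim_for_compile(default_cfg), vocab_dim_for_compile(static_cfg))
```

Keeping the flag off by default preserves the shared-library behavior for fine-tunes, and only users who opt in get a binary specialized to one vocab size.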

krishnaraj36 · 2024-05-08T11:28:34Z

Thanks @tqchen for review!

We have enhanced the OpenCL GEMV schedules to resolve this issue with a very minor perf compromise (apache/tvm#16973).

Closing this PR.

Thanks

[MODEL] Make constant vocab size for models instead dynamic

41dd4de

Use the vocab size from the model config during model compilation instead of a dynamic variable, since the dynamic variable degrades the performance of the chat application.

krishnaraj36 closed this May 8, 2024
