
Conversation

@Hzfengsy
Member

This one fixes an error introduced in #780, where `max_seq_len` is set to `-1` when it is not specified in the config file.

@peacepenguin

Can confirm this fixes mlc-llm ROCm on an AMD 6800 XT. I can now run sample_mlc_chat.py with no problem.

FYI:
Statistics: prefill: 486.6 tok/s, decode: 72.3 tok/s

Thanks so much @Hzfengsy for this patch!

@tqchen tqchen merged commit 66550e0 into mlc-ai:main Aug 23, 2023
@junrushao
Member

Thanks @Hzfengsy for the swift fix!

@masahi
Contributor

masahi commented Aug 24, 2023

It's better to assume `max_seq_len = 2048` (as before), or to use `max_sequence_length` from the config, when it is not specified via the command line; otherwise split rotary fusion gets disabled.
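The suggested fallback order can be sketched as below. This is a minimal illustration, not the actual mlc-llm implementation; the function name `resolve_max_seq_len` and the default of 2048 follow masahi's comment, and the config key `max_sequence_length` is the one he mentions.

```python
# Hypothetical sketch of the suggested fallback logic for max_seq_len.
# Names are illustrative; the real mlc-llm config handling may differ.
DEFAULT_MAX_SEQ_LEN = 2048

def resolve_max_seq_len(cmdline_value, config):
    """Pick max_seq_len from the command line, then the config, then a default."""
    # Prefer an explicit, positive command-line value when given.
    if cmdline_value is not None and cmdline_value > 0:
        return cmdline_value
    # Fall back to the model config's max_sequence_length if present.
    config_value = config.get("max_sequence_length")
    if config_value is not None and config_value > 0:
        return config_value
    # Otherwise assume the old default instead of -1, so optimizations
    # such as split rotary fusion stay enabled.
    return DEFAULT_MAX_SEQ_LEN
```

Resolving to a positive value in every branch avoids the `-1` sentinel ever reaching downstream passes that gate optimizations on a known sequence length.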

