Support using chat template in the tokenizer and use the generation config in model for vllm and openai endpoint #3351

zeyugao · 2024-05-21T04:03:29Z

Why are these changes needed?

We can use the chat_template in the tokenizer directly, which eases the effort of keeping conversation.py up to date.
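For illustration, here is a minimal sketch of the two ideas in this PR. It is not the code in the diff. The helper names `build_prompt` and `fill_missing_params` are hypothetical. The only library call assumed is `apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from Hugging Face tokenizers, together with their `chat_template` attribute.

```python
from typing import Any, Dict, List, Optional


def build_prompt(tokenizer: Any, messages: List[Dict[str, str]]) -> Optional[str]:
    """Render messages with the tokenizer's own chat template, if it has one.

    Hypothetical helper: returns None when no template is set, so the caller
    can fall back to the hand-written templates in conversation.py.
    """
    if getattr(tokenizer, "chat_template", None) is None:
        return None
    # tokenize=False returns the prompt as a string instead of token ids;
    # add_generation_prompt=True appends the assistant turn prefix.
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )


def fill_missing_params(
    request: Dict[str, Any], defaults: Dict[str, Any]
) -> Dict[str, Any]:
    """Fill in sampling parameters from the model's generation config.

    Hypothetical helper: a default is applied only when the request left the
    parameter as None, so explicit client values always win.
    """
    merged = dict(request)
    for key, value in defaults.items():
        if merged.get(key) is None:
            merged[key] = value
    return merged
```

With a real model, `tokenizer` would typically come from `transformers.AutoTokenizer.from_pretrained(...)`, and `defaults` could be built from the model's generation config. Both are assumptions about usage rather than a description of this PR's implementation.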

zeyugao added 3 commits May 21, 2024 11:55

Support use the chat template in tokenizer

3f65646

Only override the params when is None

85dc17f

Format

496a081