
Models with multiple chat templates #1336

Open
CISC opened this issue Apr 8, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@CISC (Contributor) commented Apr 8, 2024

Not an issue yet, but this will need to be handled once implemented, based on recent transformers changes.

Also note the kwargs change in the same PR, which will be used by e.g. C4AI Command R models (the new chat template is not merged yet) to pass along tools and documents. While we already support tools, it might be worthwhile to support other things as well.
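A minimal sketch of the kwargs idea above: any extra keyword arguments (such as documents) are forwarded into the template rendering context alongside the messages. The function name and shapes here are illustrative assumptions, not the actual transformers or llama-cpp-python API.

```python
# Hypothetical sketch: forward arbitrary kwargs (tools, documents, ...)
# into the chat template context, rather than hardcoding tools only.

def apply_chat_template(messages, **kwargs):
    """Build the rendering context for a chat template; unknown kwargs
    (e.g. tools=..., documents=...) are passed straight through."""
    context = {"messages": messages}
    context.update(kwargs)
    # A real implementation would render a Jinja2 template with `context`;
    # here we just return the context to show what the template would see.
    return context

ctx = apply_chat_template(
    [{"role": "user", "content": "Summarize the docs."}],
    documents=[{"title": "Doc 1", "text": "Some grounding text."}],
)
```

This keeps the call site open-ended, so a rag-style template can consume documents without the library needing to know about that parameter in advance.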

@abetlen abetlen added the enhancement New feature or request label Apr 9, 2024
@CISC (Contributor, Author) commented Apr 25, 2024

Any suggestions on how to approach this? It has been merged in llama.cpp for a while now, and many GGUFs already have the new metadata.

I suppose the initial step would be adding, e.g., a chat_template_name parameter and applying the chosen template (if found); it should probably also report which templates are available (from the tokenizer.chat_templates list).

For the server this gets more complicated; it would probably make sense to allow the caller to choose a template, and then also have an endpoint to list which templates are available.

Finally, how would you go about adding support for additional template parameters, like documents in the rag template?
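The selection and listing steps above could be sketched roughly as follows. The class and method names are hypothetical, assuming the named templates have already been read from the model metadata:

```python
# Sketch of a chat_template_name parameter with discoverable templates.
# ChatTemplateRegistry and its methods are illustrative assumptions.

class ChatTemplateRegistry:
    def __init__(self, templates):
        # templates: name -> Jinja2 template source from the model metadata
        self.templates = templates

    def available(self):
        """List the template names found in the metadata
        (what a server endpoint could return)."""
        return sorted(self.templates)

    def select(self, chat_template_name="default"):
        if chat_template_name not in self.templates:
            raise ValueError(
                f"unknown template {chat_template_name!r}; "
                f"available: {self.available()}"
            )
        return self.templates[chat_template_name]

reg = ChatTemplateRegistry({"default": "...", "rag": "...", "tool_use": "..."})
```

For the server case, `available()` maps naturally onto the suggested listing endpoint, while `select()` is what a per-request template choice would call.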

@abetlen (Owner) commented Apr 25, 2024

@CISC Do you mind posting a GGUF that uses this right now?

Yeah, I think we can make it even simpler and not introduce any new parameters; just use the existing chat_format. When no chat_format is specified, the default is used; the others can then be specified there by name.

The chat formats will be accessible through the metadata; I'm not sure we need to add anything new there, but we should add an option to change the chat format after initialization (I believe this has already been requested before).

@CISC (Contributor, Author) commented Apr 25, 2024

Sure, pmysl was the first one to update their quants. If R+ is a bit too hefty, try LlamaEdge's Command R quant.

My main worry about using chat_format is that it might conflict with an existing choice, unlikely as that is.

@abetlen (Owner) commented Apr 26, 2024

@CISC Good point, let's prefix these dynamically loaded chat templates with chat_template, so chat_template.rag or chat_template.tool_use for the Cohere model.
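The prefixing idea could look something like this. The exact GGUF metadata key layout (`tokenizer.chat_template` plus suffixed variants) is an assumption here, and the function name is hypothetical:

```python
# Sketch: namespace metadata-derived templates under a chat_template.
# prefix so they cannot collide with built-in chat_format names.

def register_metadata_templates(metadata):
    """Map GGUF template metadata keys to prefixed chat_format names."""
    formats = {}
    for key, source in metadata.items():
        if key == "tokenizer.chat_template":
            formats["chat_template.default"] = source
        elif key.startswith("tokenizer.chat_template."):
            name = key[len("tokenizer.chat_template."):]
            formats["chat_template." + name] = source
    return formats

formats = register_metadata_templates({
    "tokenizer.chat_template": "{{ messages }}",
    "tokenizer.chat_template.rag": "{{ documents }}",
    "tokenizer.chat_template.tool_use": "{{ tools }}",
})
```

Because every dynamically loaded name carries the `chat_template.` prefix, the existing hand-written chat_format registry stays untouched.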

@CISC (Contributor, Author) commented Apr 27, 2024

@abetlen That seems reasonable. I'm thinking of registering chat_template.default etc. as chat formats at init, with the Jinja2 handler setup that is done as a fallback today, and then just falling back to chat_template.default (if registered) instead.
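The fallback chain described above might be sketched like this, assuming the metadata templates were registered at init (function name and the hardcoded fallback are assumptions for illustration):

```python
# Sketch of init-time registration with fallback: explicit chat_format
# first, then the model's own default template, then a hardcoded format.

def resolve_chat_handler(requested_format, registered_formats):
    """Pick a chat handler: explicit format wins, then the model's
    chat_template.default, then the existing hardcoded fallback."""
    if requested_format is not None and requested_format in registered_formats:
        return registered_formats[requested_format]
    if "chat_template.default" in registered_formats:
        return registered_formats["chat_template.default"]
    return "llama-2"  # stand-in for today's hardcoded fallback

handler = resolve_chat_handler(None, {"chat_template.default": "jinja-default"})
```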

@CISC (Contributor, Author) commented May 28, 2024

WIP changes worth paying attention to: huggingface/transformers#30621

@CISC (Contributor, Author) commented Jun 15, 2024

Another related PR is huggingface/transformers#31429, which could be nice to replicate here; however, it requires us to differentiate between specifically selecting chat_template.default and defaulting to it, as we may not want to force chat_template.tool_use just because tools are passed.
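One way to keep "explicitly chose default" distinct from "fell back to default" is a sentinel for the unset case; the sentinel approach and names below are assumptions for illustration:

```python
# Sketch: only auto-switch to the tool_use template when the caller
# expressed no template preference at all.

_UNSET = object()  # sentinel: caller did not pick a template

def choose_template(templates, requested=_UNSET, tools=None):
    if requested is not _UNSET:
        # Explicit choice (even chat_template.default) is always honoured,
        # regardless of whether tools are passed.
        return requested
    if tools and "chat_template.tool_use" in templates:
        # No preference given: auto-select the tool_use template.
        return "chat_template.tool_use"
    return "chat_template.default"

templates = {"chat_template.default": "...", "chat_template.tool_use": "..."}
```

With `None` as a valid explicit value reserved for other meanings, the sentinel avoids conflating "no argument" with any real template name.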
