
[BUG]: unknown templates generate exception #1032

@phil-scott-78

Description


With a model whose chat template llama.cpp doesn't recognize, construct a LlamaTemplate passing in the model so the template is pulled from the model metadata. Calling Apply then results in an index out of range exception. The reason is that llama_chat_apply_template returns -1 instead of the output length when there isn't a matching template, which causes this line to fail:

output.AsSpan(0, outputLength).CopyTo(_result);
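A guard on the native return value before the copy would surface the problem as a meaningful error instead of an index exception — a minimal sketch, assuming `outputLength` is the raw result of the internal native call:

```csharp
var outputLength = ApplyInternal(_nativeChatMessages.AsSpan(0, Count), output);

// llama_chat_apply_template signals "no matching template" with a negative
// length, so the value must be checked before it is used to slice the buffer
if (outputLength < 0)
    throw new NotSupportedException("The model does not contain a recognized chat template.");

output.AsSpan(0, outputLength).CopyTo(_result);
```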

Reproduction Steps

Using Ministral-8B-Instruct-2410-Q6_K_L.gguf, use LlamaTemplate passing in the model to pull the template from the model. StatelessModeExecute is an easy way to reproduce now that it tries to apply the system template, but anything using LlamaTemplate will get the exception.
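For reference, a minimal repro along these lines (a sketch — the LLamaTemplate constructor overload and Add/Apply signatures are from memory of the main branch, not verified):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("Ministral-8B-Instruct-2410-Q6_K_L.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);

// Pull the chat template from the model itself rather than supplying one
var template = new LLamaTemplate(weights);
template.Add("system", "You are a helpful assistant.");

// Throws an index out of range exception because the native call returned -1
var bytes = template.Apply();
```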

Environment & Configuration

  • Operating system: Windows 11
  • .NET runtime version: .NET 9
  • LLamaSharp version: main branch
  • CUDA version (if you are using cuda backend): 12

Known Workarounds

Following llama.cpp's lead (https://github.com/ggerganov/llama.cpp/blob/master/src/llama.cpp#L23348), if we can't find a template then apply chatml. Funny enough this doesn't work at all for the model I'm testing, but it feels like the proper behavior until llama.cpp gets updated to support it, I suppose. llama.cpp has llama_chat_detect_template, which it can call to completely circumvent even trying to apply the template, and that would probably be the move, but it's not exposed. In the meantime, this seems to work:

var outputLength = ApplyInternal(_nativeChatMessages.AsSpan(0, Count), output);
if (outputLength == -1)
{
    // worst case: there is no information about template, we will use chatml by default
    outputLength = ApplyChatmlInternal(_nativeChatMessages.AsSpan(0, Count), output);
}
// snip
unsafe int ApplyChatmlInternal(Span<LLamaChatMessage> messages, byte[] output)
{
    // Pass an explicit "chatml" template name (null terminated for the native call)
    // instead of letting llama.cpp look one up from the model metadata
    fixed (byte* customTemplatePtr = Encoding.GetBytes("chatml\0"))
    fixed (byte* outputPtr = output)
    fixed (LLamaChatMessage* messagesPtr = messages)
    {
        return NativeApi.llama_chat_apply_template(_model, customTemplatePtr, messagesPtr, (nuint)messages.Length, AddAssistant, outputPtr, output.Length);
    }
}
