
Feature Request: #17173

@linuxmagic-mp

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Background: I ran into problems using Mistral Nemo Instruct 2407 with tools. The root issue is that even though the model is tool aware, it expects tools to be provided in a non-OAI-compatible format, e.g. wrapped in [AVAILABLE_TOOLS], hence the use of the orchestrator and .jinja templates to rewrite the OAI-compatible request into a Mistral-compatible prompt. However, this creates an issue: if you define the tools in the .jinja template, the template rewrites the prompt, but the server sees no 'tools' in the original request, never sets the logic or params to use tools, and thus the task will not examine the LLM's output for tool calls. (Ashamed to admit, this took me a lot longer to shake out than it should have.)

Technically, of course, the server should see from the GGUF itself that the LLM is tool aware, pull out the 'Mistral Nemo' name, and mark it as such, which doesn't appear to occur. The logic appears flawed, e.g.:
{{{
// Plain handler (no tools): taken whenever the request itself carried no
// tools, even if the .jinja template injects an [AVAILABLE_TOOLS] block
if (params.tools.is_null() || inputs.tool_choice == COMMON_CHAT_TOOL_CHOICE_NONE) {
    if (params.tools.is_null()) {
        LOG_DBG("MP: Short circuit, doesn't reach Nemo, tools is null\n");
    }
    return common_chat_params_init_without_tools(tmpl, params);
}

// Mistral Nemo (w/ tools): only reachable when params.tools is non-null
if (src.find("[TOOL_CALLS]") != std::string::npos) {
    return common_chat_params_init_mistral_nemo(tmpl, params);
}

// Generic fallback
return common_chat_params_init_generic(tmpl, params);
}}}

As you can see, the Mistral branch is never reached, because params.tools.is_null() is still true inside common_chat_templates_apply_jinja().
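
For context, this is roughly what the template-rewritten prompt looks like once the tools are defined in the .jinja template (illustrative only; get_weather is a made-up tool, and the exact JSON layout depends on the template):

{{{
<s>[AVAILABLE_TOOLS][{"type": "function", "function": {"name": "get_weather",
"parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}]
[/AVAILABLE_TOOLS][INST]What is the weather in Paris?[/INST]
}}}

The model then replies with something like [TOOL_CALLS][{"name": "get_weather", "arguments": {"city": "Paris"}}], which is exactly the output that never gets examined, because the task was never flagged as tool-enabled.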

Now, before I start looking at a pull request: this appears to be a logic problem that should first be discussed at the design level, to decide how best to approach it.

The current request flow, as I understand it:

HTTP request
↓
oaicompat_chat_params_parse() produces JSON "data"
↓
orchestrator (Jinja template) runs ONLY to build the text prompt
↓
params_from_json_cmpl() copies fields from "data" → task.params
↓
task submitted to inference queue
↓
LLM generates output
↓
common_chat_parse() checks task.params.tools
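
To make the disconnect concrete, here is a toy repro of that flow (self-contained, deliberately not the real server code; the names only mirror the steps above):

{{{
#include <iostream>
#include <string>

// Stand-in for task.params: the real struct carries much more state.
struct task_params { bool tools_enabled = false; };

int main() {
    // 1. OAI-compatible request: the operator left "tools" out on purpose,
    //    because the .jinja template hardcodes them.
    bool request_has_tools = false;

    // 2. Template output: the rendered prompt DOES advertise tools.
    std::string prompt = "[AVAILABLE_TOOLS][...][/AVAILABLE_TOOLS][INST]hi[/INST]";

    // 3. Equivalent of params_from_json_cmpl(): copies from the request
    //    JSON only; the rendered prompt is never consulted.
    task_params params;
    params.tools_enabled = request_has_tools; // stays false

    // 4. Equivalent of common_chat_parse(): gated on the task params, so a
    //    [TOOL_CALLS] block in the model output would never be extracted.
    std::cout << "prompt advertises tools:    "
              << (prompt.find("[AVAILABLE_TOOLS]") != std::string::npos) << "\n";
    std::cout << "task will parse tool calls: " << params.tools_enabled << "\n";
}
}}}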

My 'suggestion' is to audit this so that the server knows the model is tool aware based on the .gguf template. We still have to indicate to the task that there are 'tools' available, and that conclusion cannot be reached with certainty until after the orchestrator runs, so either the orchestrator needs to be responsible for setting the params, or the server needs to recognize that a Mistral-style .jinja template was used and re-parse the resulting prompt for the [AVAILABLE_TOOLS] block.
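
For the second option, a minimal sketch of what the detection could look like (the helper name and its exact hook point in common_chat_templates_apply_jinja() are my assumptions, offered only to anchor the discussion):

{{{
// Sketch: after rendering, detect that a Mistral-style template injected
// its own tools block even though the request carried none. `src` is the
// template source and `prompt` the rendered output, as in the snippet above.
static bool prompt_advertises_mistral_tools(const std::string & src,
                                            const std::string & prompt) {
    return src.find("[TOOL_CALLS]") != std::string::npos &&
           prompt.find("[AVAILABLE_TOOLS]") != std::string::npos;
}

// ...which would then slot in ahead of the no-tools short circuit:
//
//   if (params.tools.is_null() && prompt_advertises_mistral_tools(src, prompt)) {
//       return common_chat_params_init_mistral_nemo(tmpl, params);
//   }
}}}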

I need feedback on how this should be approached.

Motivation

Bug: Jinja templates cannot set params for tools.

Possible Implementation

No response
