
Feature Request: #17173

@linuxmagic-mp

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Background: I ran into problems using Mistral Nemo Instruct 2407 with tools. The root issue is that even though the model is tool aware, it expects tools to be provided in a non-OAI-compatible format, e.g. wrapped in [AVAILABLE_TOOLS], hence the use of the orchestrator and .jinja templates to rewrite the OAI-compatible request into a Mistral-compatible prompt. However, this creates an issue: if you define the tools in the .jinja template, the template rewrites the prompt, but the server sees no 'tools' in the original request, never sets the logic or params to use tools, and thus the task will not examine the LLM's output for tool calls. (Ashamed to admit, this took me a lot longer to shake out than it should have.)

Technically, of course, the server should see from the GGUF itself that the LLM is tool aware, pull out the 'Mistral Nemo' name, and mark it as such, which doesn't appear to occur. The logic appears flawed, e.g.:
{{{
// Plain handler (no tools): taken whenever the request itself carried no
// tools, even if the .jinja template injects an [AVAILABLE_TOOLS] block
if (params.tools.is_null() || inputs.tool_choice == COMMON_CHAT_TOOL_CHOICE_NONE) {
    if (params.tools.is_null()) {
        LOG_DBG("MP: Short circuit, doesn't reach Nemo, tools is null\n");
    }
    return common_chat_params_init_without_tools(tmpl, params);
}

// Mistral Nemo (w/ tools): only reachable when params.tools is non-null
if (src.find("[TOOL_CALLS]") != std::string::npos) {
    return common_chat_params_init_mistral_nemo(tmpl, params);
}

// Generic fallback
return common_chat_params_init_generic(tmpl, params);
}}}

As you can see, the Mistral branch is never reached, because params.tools.is_null() is still true inside common_chat_templates_apply_jinja().
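
For context, this is roughly what the template-rewritten prompt looks like once the tools are defined in the .jinja template (illustrative only; get_weather is a made-up tool, and the exact JSON layout depends on the template):

{{{
<s>[AVAILABLE_TOOLS][{"type": "function", "function": {"name": "get_weather",
"parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}]
[/AVAILABLE_TOOLS][INST]What is the weather in Paris?[/INST]
}}}

The model then replies with something like [TOOL_CALLS][{"name": "get_weather", "arguments": {"city": "Paris"}}], which is exactly the output that never gets examined, because the task was never flagged as tool-enabled.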

Now, before I start looking at a pull request: this appears to be a logic problem that should first be discussed at the design level, to decide how best to approach it.

The current request flow, as I understand it:

HTTP request
↓
oaicompat_chat_params_parse() produces JSON "data"
↓
orchestrator (Jinja template) runs ONLY to build the text prompt
↓
params_from_json_cmpl() copies fields from "data" → task.params
↓
task submitted to inference queue
↓
LLM generates output
↓
common_chat_parse() checks task.params.tools
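
To make the disconnect concrete, here is a toy repro of that flow (self-contained, deliberately not the real server code; the names only mirror the steps above):

{{{
#include <iostream>
#include <string>

// Stand-in for task.params: the real struct carries much more state.
struct task_params { bool tools_enabled = false; };

int main() {
    // 1. OAI-compatible request: the operator left "tools" out on purpose,
    //    because the .jinja template hardcodes them.
    bool request_has_tools = false;

    // 2. Template output: the rendered prompt DOES advertise tools.
    std::string prompt = "[AVAILABLE_TOOLS][...][/AVAILABLE_TOOLS][INST]hi[/INST]";

    // 3. Equivalent of params_from_json_cmpl(): copies from the request
    //    JSON only; the rendered prompt is never consulted.
    task_params params;
    params.tools_enabled = request_has_tools; // stays false

    // 4. Equivalent of common_chat_parse(): gated on the task params, so a
    //    [TOOL_CALLS] block in the model output would never be extracted.
    std::cout << "prompt advertises tools:    "
              << (prompt.find("[AVAILABLE_TOOLS]") != std::string::npos) << "\n";
    std::cout << "task will parse tool calls: " << params.tools_enabled << "\n";
}
}}}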

My 'suggestion' is to audit this so that the server knows the model is tool aware based on the .gguf template. We still have to indicate to the task that there are 'tools' available, and that conclusion cannot be reached with certainty until after the orchestrator runs, so either the orchestrator needs to be responsible for setting the params, or the server needs to recognize that a Mistral-style .jinja template was used and re-parse the resulting prompt for the [AVAILABLE_TOOLS] block.
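
For the second option, a minimal sketch of what the detection could look like (the helper name and its exact hook point in common_chat_templates_apply_jinja() are my assumptions, offered only to anchor the discussion):

{{{
// Sketch: after rendering, detect that a Mistral-style template injected
// its own tools block even though the request carried none. `src` is the
// template source and `prompt` the rendered output, as in the snippet above.
static bool prompt_advertises_mistral_tools(const std::string & src,
                                            const std::string & prompt) {
    return src.find("[TOOL_CALLS]") != std::string::npos &&
           prompt.find("[AVAILABLE_TOOLS]") != std::string::npos;
}

// ...which would then slot in ahead of the no-tools short circuit:
//
//   if (params.tools.is_null() && prompt_advertises_mistral_tools(src, prompt)) {
//       return common_chat_params_init_mistral_nemo(tmpl, params);
//   }
}}}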

I need feedback on how this should be approached.

Motivation

Bug: Jinja templates cannot set params for tools.

Possible Implementation

No response
