Feature Request: Add a debug option to display OpenAI-Compatible toolcall chunks in the WebUI #16597

@ServeurpersoCom

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

TL;DR

Add a debug option to the WebUI that displays raw toolcall chunks (like reasoning blocks) and lets users inject custom Harmony-formatted tool documentation: a simple, transparent way to inspect model behavior and help the community improve llama.cpp.

Summary

Introduce a new WebUI Settings option to display OpenAI-Compatible toolcall chunks, similar to the existing reasoning_content (thinking blocks) display.
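
For context, in the OpenAI-Compatible streaming API a tool call arrives as a sequence of tool_calls deltas inside chat.completion.chunk events, roughly like this (abbreviated SSE payloads; the id and argument values are illustrative):

```
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"get_weather","arguments":""}}]},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"location\":\"Paris\"}"}}]},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}
```

These are exactly the chunks the proposed option would surface verbatim in the UI.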

This idea was inspired by PR #13501 by @samolego, who did excellent exploratory work on tool calling in the WebUI.
Even though that PR was eventually closed, it sparked this simpler and safer approach: a read-only visualization of the Harmony toolcall field (if present) that fits cleanly into the existing WebUI logic.
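
On the display side, a read-only view could reuse the same accumulation pattern the WebUI already applies to reasoning_content: gather the streamed deltas per tool-call index and render the joined result as a collapsible block. A minimal TypeScript sketch (the names here are hypothetical, not existing WebUI code):

```ts
// Hypothetical accumulator: merges streamed tool_calls deltas for display only.
interface ToolCallView {
  id?: string;
  name?: string;
  arguments: string; // raw JSON text, rendered verbatim; never parsed or executed
}

function accumulateToolCallDelta(
  views: ToolCallView[],
  delta: { index: number; id?: string; function?: { name?: string; arguments?: string } }
): void {
  const view = (views[delta.index] ??= { arguments: "" });
  if (delta.id) view.id = delta.id;
  if (delta.function?.name) view.name = delta.function.name;
  if (delta.function?.arguments) view.arguments += delta.function.arguments;
}
```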

Rationale

This feature would also include a small optional text field to inject custom tool documentation. Together, the display checkbox and this input field would turn the WebUI into a lightweight debugging console, useful for verifying model compatibility or observing backend behavior during refactoring.

Proposal

  • Add a checkbox in Settings (e.g. Show toolcall chunks).
  • When enabled, the WebUI displays toolcall-related chunks as structured blocks, similar to reasoning content.
  • Add an optional, initially empty Tool prompt field to inject custom tool documentation, formatted according to the Harmony specification, directly into the JSON request. Alternatively, the existing "Custom JSON parameters to send to the API. Must be valid JSON format." field may already work for this (see the sketch after this list).
  • No runtime execution, no security implications, no additional parsing complexity — purely a read-only display option.
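
As a concrete example for the tool-prompt point above, a standard OpenAI-style tools array passed through the custom-parameters field might look like the following (illustrative get_weather definition; whether the existing field merges it into the request unchanged would need to be verified):

```json
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": { "location": { "type": "string" } },
          "required": ["location"]
        }
      }
    }
  ]
}
```

The server would presumably remain responsible for rendering this into the model's native tool section (e.g. Harmony for gpt-oss), keeping the WebUI itself format-agnostic.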

Benefits

  • Fully consistent with the existing OpenAI-Compatible API logic.
  • Helps developers debug and understand model outputs in real time.
  • Zero execution risk (read-only visualization).
  • Educational for those learning about tool calls and chunked responses.
  • No impact on inference stability or backend performance.

@ngxson @allozaur @ggerganov

Motivation

The main goal is transparency: allowing users to see the exact toolcall chunks emitted by models in real time, without executing anything client-side.
It provides valuable insight for debugging, education, and development of larger integrations built on top of llama.cpp.
This also aligns perfectly with ongoing refactoring work and non-regression testing, helping ensure consistent and predictable behavior across models and backend changes.

Possible Implementation

No response
