
Feature Request: Expose Model Internal Reasoning States (logits, attention, token-level trace) #5360

@tandenghui

Description

Hi LocalAI team 👋,

I'd like to request a feature that would greatly enhance the interpretability and debuggability of LocalAI models: the ability to expose the internal reasoning process during text generation.

Problem:
Currently, LocalAI only returns the final generated output (tokens or text), which limits insight into how the model arrived at its response.

There is no way to access intermediate model states such as the following (a possible request/response shape is sketched after the list):

- logprobs or top-k token scores at each decoding step
- attention weights per layer/head
- hidden states / intermediate token embeddings
- any kind of token-level reasoning trace
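
To make the first item concrete, here is a minimal sketch of what the feature could look like if LocalAI mirrored the OpenAI-style `logprobs` / `top_logprobs` chat-completions fields. The endpoint URL, model name, and field support are assumptions for illustration only, not current LocalAI behavior (the lack of which is exactly what this issue is about):

```python
# Hedged sketch: requesting per-token logprobs from an OpenAI-compatible endpoint.
# The URL, model name, and logprobs fields are assumptions, not confirmed LocalAI support.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local LocalAI address
    json={
        "model": "my-local-model",                # hypothetical model name
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
        "logprobs": True,        # proposed: return per-token log-probabilities
        "top_logprobs": 5,       # proposed: return the top-5 alternatives per step
    },
    timeout=60,
)
resp.raise_for_status()
choice = resp.json()["choices"][0]

# If the feature existed, each generated token could carry its logprob plus the
# top-k alternatives the model considered at that decoding step.
for tok in (choice.get("logprobs") or {}).get("content", []):
    alts = ", ".join(f"{t['token']!r}: {t['logprob']:.2f}" for t in tok.get("top_logprobs", []))
    print(f"{tok['token']!r}  logprob={tok['logprob']:.2f}  top-k: [{alts}]")
```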

This makes it hard to:

- debug model behavior
- understand model uncertainty (one possible use of exposed logprobs is sketched after this list)
- build explainable AI systems (e.g. chain-of-thought visualization, step-by-step validation)
- evaluate how model biases or hallucinations might arise
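
As one example of the uncertainty use case, here is a minimal sketch of how token-level uncertainty could be estimated from exposed top-k logprobs. The response shape is assumed (OpenAI-style `{"token": str, "logprob": float}` entries), and the numbers are made up for illustration:

```python
# Hedged sketch: estimating per-step uncertainty from top-k logprobs.
import math

def token_entropy(top_logprobs):
    """Shannon entropy (bits) over the reported top-k alternatives.

    Higher entropy means the model was less certain at that decoding step.
    This is only an approximation, since probability mass outside the
    reported top-k is not included.
    """
    probs = [math.exp(t["logprob"]) for t in top_logprobs]
    total = sum(probs) or 1.0
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)

# Made-up top-k entries for one decoding step, for illustration only:
step = [
    {"token": "blue", "logprob": -0.1},
    {"token": "clear", "logprob": -2.5},
    {"token": "bright", "logprob": -3.0},
]
print(f"entropy ~= {token_entropy(step):.2f} bits")  # low entropy -> confident step
```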
