fix(traces): cap backend trace Data to keep admin UI responsive #9960
Merged
…nsive

The previous fix (#9946) capped API trace bodies but missed backend traces, which carry the same blast radius:

- LLM backend traces store the full chat messages JSON, full response, and full streaming deltas. Every agent-pool reasoning step ships the full RAG-augmented history (50-500 KiB per trace, often 100+ traces queued).
- TTS / audio_transform / transcript traces embed a 30s audio snippet as base64, around 1.3 MiB per trace.

Both blow the /api/backend-traces JSON past tens of MiB. The admin Traces page then takes longer to download and parse the buffer than the 5s auto-refresh interval, so it stays in the loading state forever, the same symptom the API-side fix addressed.

Apply two complementary caps, both honoring LOCALAI_TRACING_MAX_BODY_BYTES:

Option A (safety net in core/trace): RecordBackendTrace walks the Data map recursively and replaces any string value larger than the cap with "<truncated: N bytes>". Catches anything a future producer forgets.

Option B (head-preserving at the producer):
- core/backend/llm.go: TruncateToBytes on messages, response, and chat_deltas content/reasoning_content so the leading content stays readable in the UI.
- core/trace/audio_snippet.go: omit audio_wav_base64 when the encoded blob would exceed the cap (truncated base64 is undecodable). The quality metrics still ship and the UI's WaveformPlayer simply skips when the field is absent.

TruncateToBytes is bounded to <= maxBytes so Option A leaves the producer's head-preserving output alone instead of replacing it with the bare marker.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7
…anels

The setting was already plumbed through env (LOCALAI_TRACING_MAX_BODY_BYTES), CLI flag, and the runtime_settings.json GET/PUT schema, but neither the main Settings page nor the inline Traces panel offered an input for it. Admins hitting the "Traces UI stuck loading" symptom had to know to set an env var or PUT raw JSON to /api/settings to dial the cap.

Add a "Max Body Bytes" row next to "Max Items" in both places. Same input type, same disabled-when-tracing-off semantics, placeholder shows the 65536 default so users see what they're inheriting.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7
…Bytes
The Tracing settings panel now has two number inputs. The previous spec
matched 'input[type="number"]' which became ambiguous and triggered a
Playwright strict-mode violation in CI. Switch to getByPlaceholder('100')
for Max Items and add a parallel spec for the new Max Body Bytes field
using getByPlaceholder('65536').
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7
Summary
`/api/backend-traces` still ships unbounded `Data` fields, so the admin Traces page stays in "loading" forever once a chatty agent-pool workload runs.

- LLM backend traces store the full `messages` JSON, the full `response`, and the full streaming `chat_deltas` content. With RAG, every reasoning step queues 50-500 KiB; 100 entries trivially reach tens of MiB.
- TTS / `audio_transform` / transcript traces embed a 30s audio snippet as base64 (~1.3 MiB per trace).

Both make the `/api/backend-traces` download + parse slower than the 5s auto-refresh, which is exactly the symptom the API-side fix addressed on the other end.

This PR applies two complementary caps, both honoring `LOCALAI_TRACING_MAX_BODY_BYTES` (same knob as the API-side cap):

- Option A (safety net in `core/trace/backend_trace.go`): `RecordBackendTrace` walks the `Data` map recursively and replaces any string value larger than the cap with `<truncated: N bytes>`. Catches anything a future producer forgets to truncate.
- Option B (head-preserving at the producer):
  - `core/backend/llm.go`: `trace.TruncateToBytes(...)` applied to `messages`, `response`, and `chat_deltas.{content,reasoning_content}`. Keeps the leading content readable in the UI.
  - `core/trace/audio_snippet.go`: omit `audio_wav_base64` when the encoded blob would exceed the cap (truncated base64 is undecodable). The quality metrics still ship and the React `WaveformPlayer` already no-ops when the field is absent.

`TruncateToBytes` guarantees output `<= maxBytes`, so Option A leaves the producer's head-preserving output alone instead of replacing it with the bare marker.

Test plan

- `go test ./core/trace/... ./core/backend/... ./core/http/middleware/...` (3 packages, all `ok`)
  - `RecordBackendTrace Data capping`: top-level oversize, nested map oversize, untouched-when-small, no-double-truncate
  - `TruncateToBytes`: happy path, disabled cap, oversize, pathologically small cap
  - `AudioSnippetFromPCM byte cap`: drop+keep metrics when oversize, include when under cap, include when cap disabled
- … `go build ./core/trace/... ./core/backend/...`)
- … `InitBackendTracingIfEnabled`, `AudioSnippet*`) updated
- … `local-ai-distributed` namespace and confirm the `/api/backend-traces` payload stays under a few hundred KiB and the admin Traces page renders within the 5s auto-refresh window