feat(observability): tag every router log with session_key + add turn-flow I/O logs #244
Merged
… without LOG_LEVEL=debug
Summary
Every router log line in `internal/proxy/` is now tagged with `session_key`, `request_id`, `api_key_id`, and `ingress`. A single Cloud Logging filter on `jsonPayload.session_key=…` surfaces the full trace of a session through the router (planner, scorer, pin lookup, handover, and dispatch), with no need to grep across unrelated lines.
How it works
- `observability.WithLogger(ctx, log)` / `observability.FromContext(ctx)` in `internal/observability/logger.go`. `WithLogger` puts a `*slog.Logger` on the request context; `FromContext` returns it (or the global default if nothing is set). Existing `Get()` and `FromGin()` still work.
- `bindRequestLogger` helper (`internal/proxy/session_key.go`) derives the session key once and returns a context carrying a logger pre-bound with `session_key`/`request_id`/`api_key_id`/`ingress`.
- Called in `ProxyMessages`, `ProxyOpenAIChatCompletion`, and `ProxyGeminiGenerateContent`, right after envelope parse and before any routing logic.
- Migrated `internal/proxy/*` from `observability.Get()` to `observability.FromContext(ctx)` everywhere `ctx` was already in scope. A few helpers (`writeNewPin`, `refreshPin`, `enqueuePinUpsert`, `logPlannerOutcome`, `handleForceModelCommand`, `handleToolCallLoopBreak`, `handleNoProgressBreak`) gained a leading `ctx` param so they inherit the bound logger.
Turn-flow I/O logs (Debug)
Added structured Debug logs at the points where turn flow makes decisions, so a session's path can be reconstructed from logs alone:
- `Proxy{Messages,OpenAIChatCompletion,GeminiGenerateContent}` start, with requested model, stream flag, message count, has_tools, token estimate, and a 200-char prompt preview.
- `runTurnLoop`: turn-type classification, pin lookup hit/miss (with model/provider/reason/age), scorer decision, and tier-clamp events.
- `logInboundToolTraffic`: dumps the trailing 5 assistant tool_use names + 160-char arg previews when tools are present, so a misbehaving turn can be correlated back to the prior tool_use/tool_result shape without dumping the full body.
Full upstream bodies remain behind `LOG_LEVEL=debug` via the existing `logUpstreamBody`.
What this does NOT do (follow-ups)
Test plan