fix: record debug runs for proposed chat titles#24820
Conversation
|
@codex review |
|
Codex Review: Didn't find any major issues. Bravo. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 83811489d1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
The refactoring is clean and structurally correct. Removing the enableDebug boolean and letting the runtime debug policy be the only gate is the right fix at the right level. The manualTitleResult struct is a proportional replacement for the 5-value return. Frontend invalidation scope is correct (only debug runs for proposal, not chat list/detail). All existing tests pass, and the new test covers both enabled and disabled debug paths.
"Boring code. No fight today. ♠" (Hisoka)
2 P3, 1 P4, 3 Nits. The P3s are about test coverage and documentation gaps, not code correctness. The core generation and commit paths are verified by the existing test suite.
🤖 This review was automatically generated with Coder Agents.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2c233c8e69
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
All six R1 findings verified as fixed. The refactoring is clean: enableDebug boolean removed, generateManualTitleCandidate documented with its no-usage-recording contract, failure-path debug finalization tested, double-accounting guard in place, type renamed, helper reused. The auto-promote test stabilization is a welcome bonus.
One new Nit on the insertUserTextMessage parameter name. Otherwise, the PR is ready.
"Boring code. No fight today. ♠" (Hisoka, again)
🤖 This review was automatically generated with Coder Agents.
|
@codex review |
|
Codex Review: Didn't find any major issues. You're on a roll. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Testing
go test ./coderd/x/chatd -run TestAutoPromote_InsertFailureSkipsStatusUpdate -count=20go test ./coderd/x/chatd -run 'TestAutoPromote_InsertFailureSkipsStatusUpdate|TestRegenerateChatTitle_PersistsAndBroadcasts|TestProposeChatTitle_DebugRun' -count=1go test ./coderd -run 'TestProposeChatTitle|TestRegenerateChatTitle' -count=1pnpm -C site test -- src/api/queries/chats.test.tspnpm -C site exec biome format --write src/api/queries/chats.ts src/api/queries/chats.test.ts src/pages/AgentsPage/AgentsPage.tsxpnpm -C site exec biome check --error-on-warnings src/api/queries/chats.ts src/api/queries/chats.test.ts src/pages/AgentsPage/AgentsPage.tsxpnpm -C site run lint:typesmake fmtmake lint📋 Implementation Plan
Plan: Unify manual chat title proposal and regeneration
Goal
Refactor manual chat title generation so the UI rename-dialog Generate flow
and the direct Regenerate title flow share the same internal generation
path, while preserving their different commit behavior:
proposegenerates a candidate title, records usage, returns the candidate,and does not persist the title.
regenerategenerates a candidate title through the same path, records usage,persists the generated title, publishes the title-change event when needed,
and returns the updated chat.
This should fix the current debug-panel gap where
/title/proposebypassesmanual title debug capture even though it performs the same model generation as
/title/regenerate.Evidence gathered
onPropose(chat.id)insite/src/pages/AgentsPage/components/Sidebar/RenameChatDialog.tsx.site/src/pages/AgentsPage/AgentsPage.tsximplements that asAPI.experimental.proposeChatTitle(chatId)viarequestProposeTitle.site/src/api/api.tsmaps proposal to/api/experimental/chats/${chatId}/title/proposeand regeneration to/api/experimental/chats/${chatId}/title/regenerate.coderd/exp_chats.go:proposeChatTitlecallsapi.chatDaemon.ProposeChatTitle.regenerateChatTitlecallsapi.chatDaemon.RegenerateChatTitle.coderd/x/chatd/chatd.go, both internal paths already sharefetchAndGenerateManualTitle, but with different debug behavior:proposeChatTitleWithStorecallsfetchAndGenerateManualTitle(ctx, store, chat, keys, false).regenerateChatTitleWithStorecallsfetchAndGenerateManualTitle(ctx, store, chat, keys, true).fetchAndGenerateManualTitleonly callsprepareManualTitleDebugRunwhenenableDebugis true, so/title/proposenever creates or finalizes atitle_generationdebug run.regenerateChatTitlefrontend cache logic already invalidates chat debug runsin
site/src/api/queries/chats.ts;proposeChatTitlecurrently bypasses thatmutation/query invalidation path.
Design direction
Use one shared lower-level generation helper, not a public-method call chain.
RegenerateChatTitleshould not literally callProposeChatTitle, because thepublic propose method already handles locking, usage recording, failure
accounting, and returns only a string. Calling it directly from regenerate would
risk double accounting, lock-boundary issues, and loss of model/usage metadata.
Instead, split the backend flow into two explicit phases:
Backend implementation plan
1. Introduce a small result type
In
coderd/x/chatd/chatd.go, add an internal result type near the manual titlehelpers:
Keep the type unexported and narrowly scoped to the manual title helpers.
2. Replace the boolean-gated helper with a shared candidate generator
Refactor
fetchAndGenerateManualTitleinto a helper with intent in the name,for example:
This helper should keep the existing behavior from
fetchAndGenerateManualTitle:p.checkUsageLimit.mergeManualTitleMessages.hasMessages=falsewhen there are no messages.p.resolveManualTitleModel.generateManualTitle.finishDebugRun(err)exactly once after generation, using the originalgeneration error before wrapping and before usage or persistence work.
generate manual title: %w.manualTitleGenerationErrorbehavior when usage is available, sofailure accounting through
recordManualTitleGenerationFailurestill works.recordManualTitleUsage; usage recording belongs only in theendpoint-specific commit path.
Remove the endpoint-controlled
enableDebugboolean. The only debug gate shouldbe the existing runtime policy:
This makes proposal and regeneration consistent while still respecting global,
admin, and user debug-log settings.
3. Keep
proposeChatTitleWithStoreas the non-persisting commit pathUpdate
proposeChatTitleWithStoreto callgenerateManualTitleCandidateandthen record usage without updating the title:
Preserve the existing response behavior: return only
result.titleand do notpersist the title or bump
updated_at.4. Keep
regenerateChatTitleWithStoreas the persisting commit pathUpdate
regenerateChatTitleWithStoreto call the same candidate helper and thenrecord usage while updating the title:
Preserve existing behavior:
codersdk.ChatWatchEventKindTitleChangeonly when the persisted titlediffers from the previous title.
5. Preserve outer locking and failure accounting
Do not move manual title locks out of the public methods unless tests reveal a
specific need. Preserve current boundaries:
ProposeChatTitleresolves keys, acquires/releases the manual title lock,delegates to
proposeChatTitleWithStore, and usesrecordManualTitleGenerationFailureon errors.RegenerateChatTitleresolves keys, acquires/releases the manual title lock,delegates to
regenerateChatTitleWithStore, and uses existing failurehandling.
This avoids changing concurrency semantics while still unifying generation.
Frontend implementation plan
1. Add a query-aware proposal mutation
In
site/src/api/queries/chats.ts, add a mutation helper for proposal, or routerequestProposeTitlethrough equivalent query logic:API.experimental.proposeChatTitle(chatId).chatDebugRunsKey(chatId)via the existing
invalidateChatDebugRunshelper. Failed title generations areoften the runs users need to inspect.
because proposal does not persist title changes.
2. Use the proposal mutation from
AgentsPageIn
site/src/pages/AgentsPage/AgentsPage.tsx, replace the directAPI.experimental.proposeChatTitle(chatId)call with the query-aware mutation.Keep the return value as
result.titleso the rename dialog behavior does notchange.
Tests
Backend tests
Update or add tests near the existing chat title tests:
coderd/exp_chats_test.goTestProposeChatTitle/DoesNotPersistTitleOrBumpUpdatedAtpassing.generation succeeds, if practical with existing helpers.
coderd/x/chatd/chatd_internal_test.goProposeChatTitlecreates/finalizes atitle_generationdebug run when debug logging is enabled.ProposeChatTitledoes not create adebug run when debug logging is disabled.
ChatWatchEventKindTitleChange.generation failure should finalize the debug run with failure status and
preserve
recordManualTitleGenerationFailurebehavior.path runs once per successful proposal or regeneration in mock-based tests.
Frontend tests
Update
site/src/api/queries/chats.test.ts:proposeChatTitlecache invalidation test mirroring the existingregenerateChatTitle invalidates debug runs so the title_generation run surfaces immediatelytest.Update Storybook interaction coverage only if UI behavior changes. The current
rename dialog UI should remain unchanged, so existing stories should still pass.
Validation commands
Run focused checks first:
Run frontend checks scoped to the touched query tests. Use the repository's
existing package scripts after confirming the exact command in
package.jsonorsite scripts, for example the relevant
pnpm/Vitest invocation forsite/src/api/queries/chats.test.ts.Then run required broader validation for touched code:
If Go or generated API types change, run the repository-required generation step
before linting. This refactor should not require database or SDK generation if
no API contracts change.
Dogfooding and quality gates
Backend quality gates
After backend tests pass:
Start the development environment with the repository-supported command:
Enable chat debug logging for the test user or deployment.
Create or open an agent chat with existing messages.
Use the rename dialog's Generate button.
Confirm the proposed title appears in the input and the persisted chat title
has not changed yet.
Open the debug panel and confirm a new
title_generationrun appears.Save the proposed title and confirm this save does not create a second title
generation run.
Use the direct Regenerate title action.
Confirm exactly one additional
title_generationrun appears and the visiblechat title updates.
Visual evidence for reviewers
Use
agent-browseror equivalent browser automation during dogfooding andcapture:
title_generationrun.and direct Regenerate.
Attach screenshots and recording to the final implementation summary or PR notes
if a PR is requested.
Advisor review incorporated
Advisor review agreed with the shared generation helper and separate commit
paths, with these refinements incorporated into this plan:
generation error, before wrapping and before usage or persistence work.
persistence failures should not retroactively change generation status.
/title/proposeafter removing theendpoint-level
enableDebugbypass.of this refactor.
Acceptance criteria
enableDebug=falsebypass for manual title modelcalls.
/title/proposerecords and finalizes debug runs when chat debug logging isenabled.
/title/regeneratecontinues to record and finalize debug runs when chat debuglogging is enabled.
/title/proposedoes not persist the generated title or bumpupdated_at./title/regeneratepersists the generated title and publishes a title-changeevent only when the title changes.
regeneration.
failure accounting.
can show the new run without a full page reload.
Risks and mitigations
passing an empty title to
recordManualTitleUsage; keep/update the existingDoesNotPersistTitleOrBumpUpdatedAttest.persistence; only commit helpers should call
recordManualTitleUsage.boundaries in this refactor.
debugSvc.IsEnabledas theonly debug capture gate and test both enabled and disabled behavior where
practical.
proposal flow, not just backend debug run creation.
runs represent model-generation status, not later usage-recording or
title-persistence status. Preserve current semantics by finalizing immediately
after generation with the original generation error.
/title/proposemeans rename-dialog Generate can store title-generationprompts and responses when debug logging is enabled. This is desired for
observability and consistent with regeneration, but proposal may be clicked
more often than direct regeneration.
Generated with
mux• Model:openai:gpt-5.5• Thinking:xhigh