
Add feedback configuration tooling and docs to chatbot template#142

Open
smurching wants to merge 26 commits into databricks:main from smurching:chatbot-feedback-tooling

Conversation


@smurching smurching commented Feb 27, 2026

Summary

Follow-on to #140 — adds tooling and documentation to help users configure feedback for their deployed chatbot app, plus important bug fixes and DB-backed vote persistence discovered while dogfooding the feature end-to-end.

Feedback tooling:

  • New scripts/get-experiment-id.ts resolves the MLflow experiment name for any agent type (npx tsx scripts/get-experiment-id.ts --endpoint <name>)
  • databricks.yml and app.yaml include FEEDBACK RESOURCE comments pointing to the script
  • README gains a Feedback Collection section with step-by-step setup instructions
  • New Claude Code quickstart skill auto-configures the feedback widget during app setup
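The `--endpoint` lookup boils down to finding the `MONITOR_EXPERIMENT_ID` tag on a serving endpoint. A minimal sketch of that filtering step, with the tag shape and function name as illustrative assumptions (the real script also resolves the ID to an experiment name via the MLflow API):

```typescript
// Hypothetical sketch of the --endpoint lookup step: given the tags returned
// for a serving endpoint, find the MONITOR_EXPERIMENT_ID tag. The {key, value}
// shape and function name are assumptions for illustration, not the script's
// exact code.
interface EndpointTag {
  key: string;
  value: string;
}

function resolveMonitorExperimentId(tags: EndpointTag[]): string | null {
  const tag = tags.find((t) => t.key === "MONITOR_EXPERIMENT_ID");
  return tag ? tag.value : null;
}

// An agent endpoint with a linked monitoring experiment resolves to its ID;
// a foundation-model endpoint without the tag yields null, so the script can
// fail gracefully with a helpful message.
console.log(resolveMonitorExperimentId([{ key: "MONITOR_EXPERIMENT_ID", value: "123" }]));
console.log(resolveMonitorExperimentId([]));
```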

DB-backed vote persistence:

  • New Vote table in Postgres (composite PK on chatId + messageId, FK to Chat and Message)
  • POST /api/feedback writes the vote to DB (fire-and-forget) alongside the MLflow assessment
  • GET /api/feedback/chat/:chatId reads from DB in a single query instead of N MLflow trace fetches — O(1) page load regardless of conversation length
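The upsert-on-composite-key semantics can be sketched with an in-memory stand-in for the Postgres `Vote` table (the helper names mirror the ones listed below, but the implementation here is illustrative only):

```typescript
// Hedged sketch of the vote semantics: one row per (chatId, messageId), and a
// repeat vote overwrites the previous value (upsert). A Map keyed on the
// composite key stands in for the Postgres Vote table.
const votes = new Map<string, boolean>();

function voteMessage(chatId: string, messageId: string, isUpvoted: boolean): void {
  votes.set(`${chatId}:${messageId}`, isUpvoted); // upsert on the composite key
}

function getVotesByChatId(chatId: string): Record<string, boolean> {
  // One pass over stored votes, mirroring the single-DB-query page load
  // (versus one MLflow trace fetch per message in the old path).
  const result: Record<string, boolean> = {};
  votes.forEach((isUpvoted, key) => {
    const [cid, messageId] = key.split(":");
    if (cid === chatId) result[messageId] = isUpvoted;
  });
  return result;
}

voteMessage("chat-1", "msg-1", true); // thumbs-up
voteMessage("chat-1", "msg-1", false); // thumbs-down after thumbs-up: updated
console.log(getVotesByChatId("chat-1")); // → { 'msg-1': false }
```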

Bug fixes (found while dogfooding):

  • Assistant messages not saved to DB: createUIMessageStream uses a short non-UUID ID generator by default (e.g. Xt8nZiQRj1fS4yiU), but Message.id is typed as uuid in Postgres. Fixed by passing generateId: generateUUID to createUIMessageStream.
  • Server crash on DB error: In Node.js 22, an unhandled rejection inside onFinish crashes the server process. Wrapped saveMessages in try/catch so DB failures are logged rather than crashing. Added .databricks to .gitignore to prevent stale Terraform state from being accidentally committed.
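The shape of the crash fix is simple: any rejection escaping `onFinish` is fatal under Node.js 22's default unhandled-rejection behavior, so the persistence call is wrapped and logged. A self-contained sketch (the failing `saveMessages` here is a stand-in, not the real query helper):

```typescript
// Sketch of the crash fix: wrap the persistence call so a DB failure becomes a
// logged error instead of an unhandled rejection (which kills the process in
// Node.js 22). saveMessages is a simulated stand-in for the real helper.
async function saveMessages(messages: string[]): Promise<void> {
  throw new Error("simulated postgres failure");
}

async function onFinish(messages: string[]): Promise<void> {
  try {
    await saveMessages(messages);
  } catch (error) {
    // Log rather than rethrow: the stream already completed for the client.
    console.error("[Chat] Failed to save messages:", error);
  }
}

onFinish(["hello"]).then(() => console.log("server still up"));
```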

Changes

  • scripts/get-experiment-id.ts (replaces .sh): TypeScript helper supporting two lookup modes:
    • --endpoint reads MONITOR_EXPERIMENT_ID tag from a serving endpoint, resolves to experiment name
    • --agent-brick lists all Agent Bricks tiles via GET /api/2.0/tiles and filters by name
  • databricks.yml: reverted to template code; FEEDBACK RESOURCE block with commented-out experiment resource; fixed Multi-Agent Supervisor docs link
  • app.yaml: reverted to template code; FEEDBACK RESOURCE comment for MLFLOW_EXPERIMENT_ID
  • README.md: new Feedback Collection section; deployment step updated to list both optional features with database requirement note
  • packages/db/src/schema.ts: new Vote table with (chatId, messageId) composite PK
  • packages/db/migrations/0002_polite_red_hulk.sql: migration adding the Vote table
  • packages/db/src/queries.ts: voteMessage() (upsert) and getVotesByChatId() helpers
  • server/src/routes/feedback.ts: POST writes vote to DB; GET reads from DB; removed retry loop and sleep(); removed Cache-Control: no-store
  • server/src/routes/chat.ts: pass generateId: generateUUID to createUIMessageStream; wrap onFinish in try/catch; removed traceId-before-error special case
  • packages/db/src/queries.ts: log postgres errors in saveMessages catch block
  • .gitignore: simplified .claude/* rules; add .databricks

Test plan

  • Send a message to an agent endpoint — user message and assistant reply both appear in DB (Message_v2 table)
  • Simulate a DB failure — server stays up and logs the error; streaming still completes successfully
  • npx tsx scripts/get-experiment-id.ts --endpoint <agent-endpoint> returns experiment name
  • npx tsx scripts/get-experiment-id.ts --agent-brick <name> works for KA and MAS
  • npx tsx scripts/get-experiment-id.ts --endpoint <foundation-model-endpoint> fails gracefully with helpful message
  • Set MLFLOW_EXPERIMENT_ID and send a message to a trace-emitting endpoint — thumbs feedback buttons appear and submit successfully
  • Submit thumbs-up, refresh page — thumbs-up is still highlighted (served from DB, single query)
  • Submit thumbs-down after thumbs-up — vote is updated; GET reflects the new value

Integration test: agent-openai-agents-sdk + this PR branch (dogfood, 2026-03-02)

Tested e2e-chatbot-app-next from this PR branch bundled inside agent-openai-agents-sdk, deployed to https://e2-dogfood.staging.cloud.databricks.com via databricks bundle deploy.

Ephemeral mode (no database resource):

  • GET /api/config → {"features":{"chatHistory":false,"feedback":true}} — feedback enabled, no DB badge shown
  • Sent a chat message → agent responded; data-traceId part delivered in stream (tr-f1a5b69389d89a1707a7fb3fd63e389f)
  • POST /api/feedback (thumbs_up) → {"success":true,"mlflowAssessmentId":"a-f4a0e0a757574399aa78339f22c0338a"}
  • POST /api/feedback (thumbs_down, same message) → same mlflowAssessmentId returned — PATCH deduplication working ✓
  • MLflow trace confirmed user_feedback assessment with source_type: HUMAN, feedback.value: false (thumbs_down wins) ✓
  • GET /api/feedback/chat/:chatId returns {} — no DB, expected ✓
  • Server logs show [Chat] Running in ephemeral mode - no persistence; no crash ✓
  • Note: AI_TypeValidationError appears in logs when the MLflow AgentServer sends a raw {"trace_id":"..."} chunk through the Vercel AI SDK stream writer, resulting in finishReason: "error". Trace ID capture and feedback both work correctly end-to-end despite this. Pre-existing issue with agent-openai-agents-sdk backend format, not introduced by this PR.

With-database mode (Lakebase agent-backend-db-dogfood bound as app resource):

  • GET /api/config → {"features":{"chatHistory":true,"feedback":true}}
  • DB migrations ran automatically during npm run build step ✓
  • POST /api/feedback (thumbs_up) → {"success":true,"mlflowAssessmentId":"a-bddb33fa18964d508b4fb0a3ce421d98"}
  • GET /api/feedback/chat/:chatId immediately returned the vote from DB — persistence confirmed ✓
  • Logs: [isDatabaseAvailable] Database available: true, OAuth service principal DB connection established; no errors ✓

🤖 Generated with Claude Code

smurching and others added 20 commits February 27, 2026 00:13
The thumbs up/down feedback widget is disabled by default because
it requires permissions that may not be available in all deployed
environments. Set FEEDBACK_ENABLED=true to enable it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the FEEDBACK_ENABLED env var approach with detection based on
the presence of MLFLOW_EXPERIMENT_ID, mirroring the ephemeral mode
pattern (PGDATABASE → chatHistory).

- config.ts: feedback = !!process.env.MLFLOW_EXPERIMENT_ID
- databricks.yml: add optional FEEDBACK RESOURCE (1)+(2) comments
  for mlflow_experiments + app experiment resource, parallel to the
  existing DATABASE RESOURCE pattern
- app.yaml: MLFLOW_EXPERIMENT_ID now uses valueFrom: experiment
- chat-header.tsx: add "Feedback disabled" badge (MessageSquareOff)
  alongside the existing "Ephemeral" badge; both are clickable links
  to the docs page
- .env.example: replace FEEDBACK_ENABLED with MLFLOW_EXPERIMENT_ID docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Helper script that resolves the MLFLOW_EXPERIMENT_ID needed to enable
the feedback widget, supporting three sources:
  --endpoint  serving endpoint (reads MONITOR_EXPERIMENT_ID tag)
  --tile-id   Agent Bricks tile (GraphQL API; REST equivalent TBD)
  --app       Databricks App (reads mlflow_experiment resource)

Also fix databricks.yml validation warning (remove invalid top-level
mlflow_experiments key) and update app.yaml / databricks.yml comments
to point to the script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GraphQL (/graphql) requires a browser CSRF token and doesn't work
with a CLI-issued token. /api/2.0/tiles/{id} is a proper REST
endpoint that accepts standard Bearer auth and returns the same
mlflow_experiment_id field.

Tested against dogfood:
  --endpoint agents_smurching-default-feb_2026_agent  ✓ returns experiment name
  --endpoint databricks-claude-sonnet-4-5             ✓ fails gracefully (no tag)
  --tile-id  9f3c2abd-b9c5-4d41-ab69-b6fc50df91d3    ✓ candy-bot MAS
  --app      agent-lc-ts-dev                          ✓ fails gracefully (no resource)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The actual API shape is .experiment.experiment_id (numeric),
not .mlflow_experiment.name. Resolve the ID to a name via
/api/2.0/mlflow/experiments/get.

Tested: agent-lc-ts-dev → /Users/sid.murching@databricks.com/[dev sid_murching] agent-langchain-ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add feedback to Features list
- Add Feedback Collection section (parallel to Database Modes)
  covering: get-experiment-id.sh, databricks.yml resource block,
  app.yaml env var, local dev via .env
- Update Deployment step 3 to mention both DATABASE RESOURCE and
  FEEDBACK RESOURCE optional sections

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, MLflow errors (e.g. PERMISSION_DENIED) were logged but the
endpoint still returned 200 success:true, so the client never showed a
toast. Now the MLflow status code is forwarded to the client, which
triggers the existing toast.error handler in message-actions.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nt resource schema

- Only show feedback buttons when both feedbackEnabled (experiment configured)
  and feedbackSupported (trace ID present) are true, instead of showing
  disabled greyed-out buttons when trace ID is absent
- Fix experiment resource in databricks.yml to use experiment_id (not name)
  and experiment key (not mlflow_experiment) per the DAB schema
- Update TODO comment to reflect correct field names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…led responses

When the agent returns a rate-limit or other error, the AI SDK's onError
handler throws and stops reading the stream, so data-traceId:null (written
after the inner stream drains) was never processed on the client. This left
traceIdPart undefined, causing feedbackSupported=true and showing clickable
feedback buttons on empty/errored messages.

Fix: write data-traceId before forwarding any error chunk, so the client
processes it (setting feedbackSupported=false) before the SDK throws.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the serving endpoint has a MONITOR_EXPERIMENT_ID tag, quickstart.sh
now offers to enable the feedback widget (default: yes). On confirmation it:
- Sets MLFLOW_EXPERIMENT_ID in .env
- Uncomments and configures the experiment resource in databricks.yml
- Uncomments MLFLOW_EXPERIMENT_ID in app.yaml

Re-running the script skips this step if feedback is already configured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Renames .claude/skills/enable-feedback/ → .claude/skills/quickstart/
- Updates .gitignore allowlist accordingly
- Rewrites skill content: primary flow is now ./scripts/quickstart.sh,
  which auto-configures feedback when the endpoint has a linked experiment
- Manual feedback setup steps updated with correct DAB schema
  (experiment: experiment_id: instead of mlflow_experiment: name:)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Feedback submission (trace assessments) requires write access to the
experiment. CAN_READ is insufficient — update all references across
databricks.yml, quickstart.sh, get-experiment-id.sh, SKILL.md, and README.

Also fix README example to use correct DAB schema (experiment: experiment_id:
instead of the old mlflow_experiment: name: field).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… gitignore

- Wrap saveMessages call in onFinish with try/catch so a DB failure
  no longer becomes an unhandled rejection that crashes Node.js 22
- Log the underlying postgres error in saveMessages catch block instead
  of silently swallowing it
- Add .databricks to .gitignore to prevent stale Terraform state from
  being carried over when copying the project directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sid Murching <sid.murching@databricks.com>
The AI SDK's default ID generator produces short non-UUID strings
(e.g. "Xt8nZiQRj1fS4yiU") but the Message.id column is typed as
uuid in Postgres. This caused every assistant message insert to fail
with "invalid input syntax for type uuid".

Pass generateId: generateUUID so the SDK uses the same UUID generator
as the rest of the app.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Adds thumbs-up/down buttons to assistant messages
- Submits feedback to MLflow assessments API on click; persists across page reloads
- Reads assessments from GET /mlflow/traces/:traceId (embedded in trace object)
  instead of the standalone /assessments endpoint, which is not available in all workspaces
- In-memory message-meta store tracks traceIds and assessmentIds for ephemeral mode
- Adds Cache-Control: no-store to GET /api/feedback/chat/:chatId to prevent 304 stale responses
- Updates test mocks to match the trace-based assessments API shape
- Adds deployed feedback E2E test and Playwright deployed project config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace O(N) MLflow GET calls on page load with a single DB query.
Add ai_chatbot.Vote table (composite PK chatId+messageId, isUpvoted bool)
matching Vercel AI Chatbot's Vote_v2 schema. POST /api/feedback now
writes votes to DB (fire-and-forget, no traceId required). GET
/api/feedback/chat/:chatId reads from DB instead of N parallel MLflow
trace fetches. New with-db tests verify persistence and vote toggling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
#
# Usage:
# ./scripts/get-experiment-id.sh --endpoint <serving-endpoint-name>
# ./scripts/get-experiment-id.sh --tile-id <agent-bricks-tile-id>

Instead of tile ID (which is not really a user-facing concept), we should ask for the agent brick name. I believe you can pass name to /api/2.0/tiles/get as well. "tiles" is not a user facing name, so use "agent brick name" instead


Also can you make this a typescript script? That way it can be run directly when the app template is cloned/copied from workspace filesystem, no need to add execute permissions


AI reply: Done. Renamed --tile-id to --agent-brick and changed the API call to use /api/2.0/tiles/get?name=<name> instead of the tile ID in the path. 'tiles' is no longer user-facing. The script is also now TypeScript (see next comment).


AI reply: Done. Converted to scripts/get-experiment-id.ts (deleted the .sh). Run with npx tsx scripts/get-experiment-id.ts --endpoint <name>. Updated README, quickstart.sh, and the quickstart skill doc to reference the new script.

headers: {
Authorization: `Bearer ${token}`,
'Content-Type': 'application/json',
// POST to create a new assessment, retrying on 404.

It's ok to just keep the old logic here, no need to support retries


AI reply: Done. Removed the retry loop — the POST to MLflow is now a single attempt.

assessmentId: assessment.assessment_id,
feedbackType: assessment.feedback?.value === true ? 'thumbs_up' : 'thumbs_down',
};
function sleep(ms: number): Promise<void> {

Is this sleep function needed? Can we remove it?


AI reply: Done. Removed sleep along with the retry logic — it was only used in the retry loop.

while (true) {
const { done, value } = await reader.read();
if (done) break;
// Write traceId before any error chunk so the client receives it

Can you remind me what this was for? IIRC it was to ensure the feedback button shows up in case of stream error (?). It's actually ok not to show feedback in that case, I don't think it's super desirable to collect feedback on error responses (app creators should instead have separate instrumentation/alerting etc on that case)


AI reply: It was added to ensure the client receives the traceId data part even when the stream ends in an error, so the feedback buttons would still appear. Removed per your suggestion — it's fine not to show feedback on error responses.

'/chat/:chatId',
requireAuth,
async (req: Request, res: Response) => {
// Prevent browser caching so stale {} responses don't block fresh feedback data

Is this still needed now that we read from the DB?


AI reply: Good catch — removed. The Cache-Control: no-store header was a workaround for 304/stale-cache issues with the old MLflow read path. DB reads are deterministic so it's no longer needed.


function getAuthToken(): string {
try {
const output = execSync('databricks auth token --output json', {

Is this the pattern of other "deployed" tests?

smurching commented Mar 2, 2026

I imagine we could get an auth token instead using WorkspaceClient().config.authenticate() etc from Databricks TS SDK


AI reply: This is the only deployed test so far, so there's no existing pattern to follow. Happy to restructure if you have a preferred approach.


AI reply: @databricks/sdk is not currently a project dependency, so I've kept the databricks auth token CLI approach which works without adding a new package. Happy to add the SDK if preferred.

Comment on lines +9 to +11
!.claude/skills/
.claude/skills/*
!.claude/skills/quickstart/

Can this be simplified? I think we just need the !.claude/skills/ right?


AI reply: Done. Simplified to just .claude/* + !.claude/skills/.

# FEEDBACK RESOURCE: Uncomment to enable the feedback UI (requires experiment resource in databricks.yml)
# - name: MLFLOW_EXPERIMENT_ID
# valueFrom: experiment
- name: MLFLOW_EXPERIMENT_ID

Should revert back to template code


AI reply: Done. Reverted MLFLOW_EXPERIMENT_ID back to the commented-out FEEDBACK RESOURCE block.

description: "Name of the model serving endpoint to be used by the app"
# TODO: uncomment the line below and specify a default value to avoid needing to specify it on each deployment
# default: "your-serving-endpoint-name-goes-here"
default: "agents_main-default-smurching_feb_27_agent"

Should revert back to template code


AI reply: Done. Reverted to TODO comment style (no default value for serving_endpoint_name).

- name: serving-endpoint
description: "Databricks serving endpoint name for the AI agent"
# NOTE: If chatting with an Agent Bricks Multi-Agent Supervisor (https://docs.databricks.com/aws/en/generative-ai/agent-bricks/multi-agent-supervisor),
# NOTE: If chatting with an Agent Bricks Multi-Agent Supervisor (https://docs.databricks.com/aws/en/generative-ai/agent-framework/agent-bricks/multi-agent-supervisor),

Should revert back to https://docs.databricks.com/aws/en/generative-ai/agent-bricks/multi-agent-supervisor, which is a working link, the new one here 404s


AI reply: Done. Reverted to the working link: https://docs.databricks.com/aws/en/generative-ai/agent-bricks/multi-agent-supervisor.

# mlflow_experiment:
# name: "your-experiment-name"
# permission: CAN_READ
- name: database

Revert back to template code


AI reply: Done. Reverted database and experiment resources back to commented-out TODO blocks.

**Tip:** To automatically configure and deploy with database support, run `./scripts/quickstart.sh` and select "Yes" when prompted about enabling persistent chat history. See [Database Configuration](#database-modes) for details.
3. **Specify serving endpoint and address TODOs in databricks.yml**: Address the TODOs in `databricks.yml`, setting the default value of `serving_endpoint_name` to the name of the custom code agent or Agent Bricks endpoint to chat with. The optional commented-out sections allow you to enable:
- **Persistent chat history** — uncomment both `DATABASE RESOURCE` blocks to provision and bind a Lakebase database. See [Database Modes](#database-modes) for details. **Tip:** run `./scripts/quickstart.sh` to do this automatically.
- **User feedback collection** — uncomment the `FEEDBACK RESOURCE` block and set the experiment name. See [Feedback Collection](#feedback-collection) for details. **Tip:** run `./scripts/get-experiment-id.sh` to find the experiment name for your agent.

Should note that this also requires a database, right?


AI reply: Done. Added a note that feedback vote persistence (restoring thumbs state on page reload) requires a database, and that both DATABASE RESOURCE blocks must be uncommented.

smurching and others added 2 commits March 1, 2026 22:01
- Convert get-experiment-id.sh to TypeScript (get-experiment-id.ts)
- Rename --tile-id to --agent-brick; use /api/2.0/tiles/get?name= API
- Remove MLflow POST retry loop and sleep() helper in feedback.ts
- Remove Cache-Control: no-store header from GET /api/feedback/chat/:chatId
- Remove traceId-before-error-chunk special case in chat.ts stream drain
- Revert app.yaml and databricks.yml to template code (comment out personal config)
- Fix Multi-Agent Supervisor docs link in databricks.yml
- Simplify .gitignore .claude/* rules
- Note that feedback requires a database in README
- Biome lint auto-fixes (node: prefix, non-null assertion, unused vars)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- --agent-brick: list all tiles via GET /api/2.0/tiles and filter by name
  (tiles API has no name-based single-fetch endpoint)
- Remove --app mode (not needed)

Tested against dogfood:
- databricks-claude-sonnet-4-5: fails gracefully (no MONITOR_EXPERIMENT_ID)
- agents_smurching-default-feb_2026_agent: returns experiment name ✓
- candy-bot (MAS): returns experiment name ✓

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@smurching smurching requested a review from hubertzub-db March 2, 2026 06:15

const handleStreamComplete = useCallback(() => {
setStreamCompleted(true);
}, []);
Contributor comment:

Can we get some more context on the logic?
Based on this code: after the new chat's streaming completes, we fetch the feedback for it.
What feedback is expected if messages are brand new?


AI reply: Good question — yes, the expected feedback for a brand new chat is always {} (empty), since no votes have been submitted yet. The useChatData trigger after stream completion actually serves two purposes beyond just feedback:

  1. Chat metadata / sidebar: fetchChatData also calls GET /api/chat/:chatId and GET /api/messages/:chatId. We can't make these calls before the stream completes because the chat doesn't exist in the DB yet — it's created during the stream. After completion the chat is guaranteed to be persisted, so we fetch it then to update the sidebar.

  2. Feedback restoration on refresh: If the user submits feedback and then refreshes the page while still on the same new-chat URL, the useEffect in MessageActions (line 46) syncs the server-restored feedback into local state. Without the useChatData call, the feedback prop would stay {} forever and the thumbs wouldn't be highlighted after a refresh.

Within a session (no refresh), feedback state is tracked purely in local useState via the optimistic setFeedback(feedbackType) call in MessageActions — no server round-trip needed. The useChatData result only matters as the seed value (initialFeedback).

Happy to add a comment to the code clarifying this if that would help.


The goal in 1) makes sense, but it would be good to add a clarifying code comment that the goal of setting setStreamCompleted for a new chat is to trigger fetch of the chat metadata/sidebar. Is there not a cleaner/more direct way of doing this? it seems weird to set some stream completion state as a means to achieve chat sidebar metadata reload

For 2), I don't fully get it - how would a user submit feedback and refresh the page while on the new chat URL? Can they actually submit feedback while on the NewChatPage?


AI reply: Good point — renamed streamCompleted to chatInitialized and added a comment that explains the actual purpose: the flag gates useChatData because the chat row doesn't exist in the DB until after the first exchange, so enabling it earlier would just produce a 404.

I considered an alternative where useChatData is always enabled and we call mutate() after the stream completes to revalidate, but that would fire an unnecessary request (that 404s) on every new chat page load before the first message is sent — so the flag guard still seems like the right approach.


(I asked Claude for more details on point 2) and it replied with the following):

You're right to question it — the claim was wrong. Here's why:

When the user sends their first message from NewChatPage, softNavigateToChatId immediately updates the URL to /chat/:id via window.history.replaceState. So if the user refreshes
the page after sending a message, React Router routes them to ChatPage (not NewChatPage), and ChatPage has its own useChatData call that handles feedback restoration.

So useChatData in NewChatPage never actually restores feedback state on page refresh. Within the same session, feedback is already tracked by local useState in MessageActions
(optimistic update on submit), so the useChatData round-trip is also redundant there.

The useChatData call in NewChatPage is effectively only useful as an SWR cache pre-warm — if the data is fetched while the user is still looking at the new chat (URL already
showing /chat/:id), navigating away and back would hit ChatPage's useChatData which would find the SWR cache already populated. That's a minor incidental benefit.

In short: feedback={chatData?.feedback ?? {}} in NewChatPage is always going to be {} in practice (no votes exist on a brand-new chat), and the useChatData call could likely be
dropped entirely from NewChatPage with no visible change in behavior. The feedback prop there could just hardcode {}.

Will see if we can just remove this logic

smurching and others added 3 commits March 2, 2026 14:07
Adds a top-level '## Optional Chat UI Features' section summarising
user feedback and persistent chat history in one place. This gives
other agent templates (agent-openai-agents-sdk, agent-langgraph, etc.)
a clean anchor to link to. Also notes that agent templates with their
own databricks.yml experiment resource get feedback automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarify that the flag gates useChatData because the chat doesn't exist
in the DB yet (not just because streaming is done), and add a comment
explaining the purpose.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New chats never have prior votes, so fetching feedback after stream
completion was a no-op. On page refresh the user lands on ChatPage
(URL was already /chat/:id via replaceState), which has its own
useChatData. Remove the chatInitialized flag, useChatData call, and
the onStreamComplete prop from Chat — the sidebar refresh that
happens inside chat.tsx is unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- playwright.config.ts: in with-db mode, clear mock Databricks credentials
  (CLIENT_ID/SECRET/HOST) and pass DATABRICKS_CONFIG_PROFILE so the test
  server uses CLI auth for the real Lakebase DB. Mock OAuth credentials were
  forcing the OAuth path in ensureDb(), which called the mocked OIDC endpoint
  (returning 'test-token') — an invalid Lakebase password — causing all DB
  operations to fail with bad_request:database (400).

- server/src/routes/feedback.ts: await voteMessage() instead of fire-and-
  forget. Prevents a race where GET /api/feedback arrives before the DB write
  completes, which was causing flaky test failures.

- tests/routes/feedback.test.ts: add "GET returns empty map in ephemeral mode"
  test (Gaps 1/2 from coverage analysis); add skipInWithDatabaseMode import;
  capture chatId in no-traceId test; remove untenable DB assertion there
  (Vote FK requires messageId in Message table; real flow uses /api/chat).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>