Add thumbs up/down feedback to chatbot app#94

Merged
smurching merged 55 commits into databricks:main from smurching:add-thumbs-up-down-feedback
Feb 27, 2026
Conversation


@smurching smurching commented Jan 27, 2026

Add thumbs up/down feedback with MLflow assessment integration

Summary

Adds a thumbs up/down feedback widget to assistant messages in the chatbot UI. When a user submits feedback, it is forwarded as an MLflow assessment on the corresponding trace, enabling quality tracking in the MLflow UI.

Works in both persistent mode (PostgreSQL) and ephemeral mode (no database required).

Key design decisions

  • MLflow as source of truth: Feedback state is loaded from MLflow on page load via GET /api/feedback/chat/:chatId. The in-memory store is a short-term cache (ephemeral mode + recent messages in DB mode), not the primary source.
  • Assessment deduplication: A second feedback submission PATCHes the existing assessment rather than creating a duplicate. Assessment IDs are cached in memory for the session, with a fallback GET to MLflow.
  • Graceful degradation: If MLflow is unavailable, feedback buttons still render but submissions silently fail without breaking the chat UI.
  • Trace ID capture: The trace ID is captured from raw SSE chunks in onChunk and stored alongside the message for feedback submission.
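The deduplication and graceful-degradation decisions above can be sketched as a single routing function. Helper and store names here are hypothetical; the real logic lives in server/src/routes/feedback.ts.

```typescript
// In-memory cache of assessment IDs, keyed by `${messageId}:${userId}`.
const assessmentIds = new Map<string, string>();

function planMlflowRequest(
  messageId: string,
  userId: string,
  traceId: string | null,
): { method: 'POST' | 'PATCH'; url: string } | null {
  // Graceful degradation: no trace ID means no assessment can be recorded.
  if (traceId === null) return null;
  const base = `/api/3.0/mlflow/traces/${traceId}/assessments`;
  const existing = assessmentIds.get(`${messageId}:${userId}`);
  // A second submission updates the existing assessment instead of duplicating it.
  return existing
    ? { method: 'PATCH', url: `${base}/${existing}` }
    : { method: 'POST', url: base };
}
```

In the actual route the cache is backed by a fallback GET to MLflow (and by the DB in persistent mode), so a server restart does not cause duplicate assessments.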

Changes

Client

  • client/src/components/message-actions.tsx — thumbs up/down buttons with green highlight on active feedback
  • client/src/hooks/useFeedback.ts — feedback submission + optimistic UI updates
  • client/src/hooks/useChatData.ts — loads feedback state from server on page load

Server

  • server/src/routes/feedback.ts — POST and GET endpoints for feedback
  • server/src/routes/chat.ts — captures trace ID from raw SSE chunks
  • server/src/lib/message-meta-store.ts — in-memory store for message metadata (best-effort, not reliable across restarts)

Database

  • packages/db/src/schema.ts — adds traceId column to messages table
  • packages/db/src/queries.ts — saves and retrieves trace ID

Tests

  • tests/routes/feedback.test.ts — feedback API tests
  • tests/routes/trace-id-capture.test.ts — end-to-end trace ID capture tests
  • tests/e2e/feedback.test.ts — browser-level feedback interaction tests

Test plan

  • TEST_MODE=ephemeral npm test passes (74 passed, 12 skipped)
  • Thumbs feedback buttons turn green on click
  • Page reload restores feedback state from MLflow
  • Second feedback click PATCHes the existing assessment

smurching and others added 27 commits January 18, 2026 17:28
This commit adds user feedback capabilities to the chatbot interface,
allowing users to rate assistant responses with thumbs up/down reactions.
Feedback is stored in the database and optionally sent to MLflow for
experiment tracking.

Changes include:
- Database schema: New Feedback table with foreign keys to Message, Chat, and User
- Database queries: CRUD operations for feedback (create, read, update, delete)
- Server route: /api/feedback endpoint with MLflow integration as side effect
- UI updates: Thumbs up/down buttons in message actions for assistant messages
- MLflow client: Async submission to MLflow assessments API
- Migration: SQL migration to add Feedback table
- Tests: Unit tests for feedback route endpoints
- Documentation: Environment variables for MLFLOW_EXPERIMENT_ID in .env.example and app.yaml

The MLflow integration is optional and configured via MLFLOW_EXPERIMENT_ID
environment variable. If not set, feedback is saved to database only.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Multiple fixes to get thumbs up/down feedback working:

1. Fix Message table reference
   - Updated schema.ts to use 'Message_v2' instead of 'Message'
   - Database has both tables; Message_v2 has parts/attachments columns

2. Fix foreign key constraints
   - Updated Feedback.messageId FK to point to Message_v2
   - Script: scripts/fix-feedback-fk.ts

3. Fix feedback POST route middleware
   - Removed requireChatAccess (expects :id param that doesn't exist)
   - Added manual checkChatAccess call in handler

4. Add automatic user creation
   - Created ensureUserExists() function
   - Called before creating feedback to satisfy FK constraint
   - Fixes issue where users weren't auto-created on first use

5. Load existing feedback on page load
   - Added GET /api/feedback/chat/:chatId endpoint
   - Extended useChatData hook to fetch feedback
   - Passes feedback state through component tree to MessageActions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1. Fix React.memo bug in Messages component
   - Add feedback prop comparison
   - Fix return value (true when props equal, not false)
   - This prevents constant re-renders that caused UI to disappear

2. Fix missing feedback field in useChatData hook
   - Add feedback: {} to 404 error path
   - Ensures consistent data structure

3. Add AGENTS.md with comprehensive UI development guide
   - React component patterns and memoization
   - API endpoint testing with curl examples
   - Common debugging patterns and solutions
   - Performance optimization techniques

Testing:
- Created test script verifying memo logic
- Confirmed all memo comparison cases work correctly
- Verified API data flow with curl

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Creates automated test script that validates the complete flow:
- Session verification
- Chat creation with streaming messages
- Chat metadata retrieval
- Message history fetching
- Feedback submission (thumbs up/down)
- Feedback updates
- Chat history listing

Features:
- Colored output for better readability
- Automatic test user generation
- Error handling and validation
- Browser URL generation for manual verification
- Follows patterns from AGENTS.md

Usage:
  ./scripts/test-e2e-flow.sh [user-id] [email]

Test results:
✅ All 7 tests passed successfully
- Created chat with streaming response
- Submitted and updated feedback
- Verified data persistence

This script can be used to validate changes before committing
and ensures the feedback feature works end-to-end.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Documents comprehensive testing results for feedback feature:
- All 7 test scenarios passed successfully
- Bug fixes applied (React.memo and data structure)
- API endpoints validated
- Database state verified
- Test artifacts and examples included

Provides clear record of:
- What was tested and how
- Issues found and resolved
- Remaining tasks
- How to run tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem:
- Page went blank when clicking thumbs up/down
- React error: "Rendered more hooks than during the previous render"
- Caused by early return (if isLoading) BEFORE all hooks were called

Root Cause:
The component had this structure:
1. Call 4 hooks
2. Early return if isLoading (line 52)
3. Call useCallback hook (line 70)

When isLoading=true: 4 hooks called
When isLoading=false: 5 hooks called
This violated React's Rules of Hooks.

Fix:
- Moved ALL hooks to the top (before any early returns)
- Converted handleCopy to useCallback for consistency
- Added useMemo for textFromParts to avoid recalculation
- Added initialFeedback comparison to memo function
- Added comment explaining the requirement

Hook call order is now consistent across all renders:
1. useCopyToClipboard
2. useState (feedback)
3. useState (isSubmittingFeedback)
4. useMemo (textFromParts)
5. useEffect
6. useCallback (handleFeedback)
7. useCallback (handleCopy)
8. Early return (after all hooks)

Testing:
- Frontend hot-reload should apply changes automatically
- No more hooks violation errors
- Feedback buttons should work without blanking page

Related: https://reactjs.org/link/rules-of-hooks

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Users get visual feedback from the button color change (green/red),
so the toast notification is redundant. Keeping error toast for
failed submissions since that's important feedback when something
goes wrong.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem:
When reloading the page on an existing chat, saw this error:
  TypeError: Cannot read properties of undefined (reading 'state')
  at Chat.makeRequest (@ai-sdk_react.js:2252:35)
  at async AbstractChat.resumeStream

Root Cause:
On page reload:
1. lastPartRef.current is undefined (no stream parts received yet)
2. onFinish callback is triggered (from AI SDK's automatic resume)
3. Logic sees lastPartRef === undefined and tries to resumeStream()
4. But there's no active stream to resume - it's just a page load
5. AI SDK's internal state is undefined, causing the error

Fix:
Check if streamCursor === 0 before attempting resume. If we haven't
received any stream parts in this session, there's nothing to resume.
This only affects actual stream interruptions, not page reloads.

Flow now:
- Page reload: streamCursor=0 → skip resume → fetch history
- Stream disconnect: streamCursor>0 → attempt resume
- Stream complete: lastPart.type='finish' → skip resume → fetch history

Testing:
1. Reload page on existing chat → no error
2. Submit feedback → feedback works, no error
3. Network disconnect during stream → resume works

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
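The flow described in this commit reduces to a small guard. Names mirror the commit message (streamCursor, lastPart.type), but the exact shapes are assumptions:

```typescript
// Decide whether to attempt resumeStream() after onFinish fires.
function shouldResumeStream(streamCursor: number, lastPartType?: string): boolean {
  // Page reload: no stream parts received this session, nothing to resume.
  if (streamCursor === 0) return false;
  // Stream completed normally: fetching history is enough.
  if (lastPartType === 'finish') return false;
  // Mid-stream disconnect: attempt resume.
  return true;
}
```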
Problem:
- Network tab shows streaming happening token-by-token
- But UI doesn't show anything until entire message completes
- Then shows everything at once with feedback widget

Root Cause:
Same memo bug as before in message.tsx line 395:
- Memo function always returned false
- This disabled memoization and caused unnecessary re-renders
- Should return true when props are equal (skip re-render)

Also missing:
- initialFeedback comparison in memo function

Fix:
1. Changed final return from false to true
2. Added initialFeedback.feedbackType comparison
3. Now properly skips re-renders when props haven't changed
4. Allows re-renders when message.parts changes (streaming)

Expected behavior after fix:
✅ Tokens stream and display incrementally
✅ No feedback widget while streaming (isLoading=true hides it)
✅ Feedback widget appears when stream completes (isLoading=false)

This matches the behavior in Messages component we fixed earlier.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
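The corrected comparator can be shown as a pure function. Prop shapes here are assumptions; see client/src/components/message.tsx for the real component.

```typescript
interface MessageProps {
  isLoading: boolean;
  parts: unknown; // compared by reference, not deep equality
  initialFeedback?: { feedbackType: string | null };
}

// React.memo semantics: return true to SKIP re-rendering.
function arePropsEqual(prev: MessageProps, next: MessageProps): boolean {
  if (prev.isLoading !== next.isLoading) return false;
  // Reference check: the AI SDK can mutate parts in place, so a deep-equal
  // check may short-circuit on identical references and miss streaming updates.
  if (prev.parts !== next.parts) return false;
  // The comparison that was originally missing:
  if (prev.initialFeedback?.feedbackType !== next.initialFeedback?.feedbackType)
    return false;
  return true; // props equal, skip (the original bug returned false here)
}
```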
Added logging at three levels to track streaming:
1. ChatTransport.onStreamPart - logs when stream parts arrive
2. Chat component - logs when messages array updates
3. PreviewMessage - logs when message component renders

This will help identify where the bottleneck is:
- If only transport logs appear: AI SDK not updating messages
- If transport + chat logs appear: PreviewMessage not re-rendering
- If all logs appear frequently: Display/CSS issue

User reported only seeing one PreviewMessage log with 0 chars
even though streaming takes time, suggesting messages array
is not being updated during streaming.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Error: Cannot access 'messages' before initialization
Cause: useEffect referenced messages/status before useChat declared them

Fix: Moved useEffect to after useChat hook declaration.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This should make token-by-token streaming more responsive in the UI.
Lower throttle = more frequent UI updates = smoother streaming display.

Debug logs added in previous commits will help identify if:
- Stream parts are arriving ([ChatTransport] logs)
- Messages array is updating ([Chat] logs)
- Component is re-rendering ([PreviewMessage] logs)

These logs appear in browser console, not server logs.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Modified PreviewMessage memo to always re-render during streaming
- Removed debug logging from chat.tsx and message.tsx
- Updated TESTING_SUMMARY.md with streaming fix documentation

The issue was that the memo function blocked re-renders when isLoading=true,
preventing incremental token display. Now tokens stream smoothly and feedback
widget appears only after message completion.
Root cause: Messages component memo was blocking re-renders during streaming.
The AI SDK mutates the messages array in place, so deep equality checks
don't detect changes. This prevented PreviewMessage from receiving updated
props during streaming.

Changes:
- Modified Messages memo to always re-render when status is 'streaming'
- Added E2E testing documentation to AGENTS.md
- Created STREAMING_DEBUG_INVESTIGATION.md documenting the investigation

This ensures tokens appear incrementally during message generation while
maintaining performance optimization for static messages.
Resolved conflicts:
- client/src/components/chat.tsx: Kept both OAuth error check and streamCursor check
- client/tsconfig.json: Removed obsolete /tools path mapping
- client/src/components/messages.tsx: Removed duplicate sendMessage prop

Integrated upstream changes including OAuth error handling and package restructuring.
Removed TESTING_SUMMARY.md and STREAMING_DEBUG_INVESTIGATION.md to keep
PR focused on feature implementation. Core documentation in AGENTS.md
with E2E testing instructions is preserved.
…bricks_options.return_trace: true for serving endpoints in AI SDK provider

Signed-off-by: Sid Murching <sid.murching@databricks.com>
- Add feedback table to DB schema with feedbackType, mlflowAssessmentId
- Capture trace IDs from Databricks Responses API via databricksFetch layer
  (injects databricks_options.return_trace=true) and onChunk callback
- Store trace IDs in DB (traceId column on message) and in-memory fallback
  (message-meta-store.ts) for ephemeral mode
- POST feedback as MLflow assessment (api/3.0/mlflow/traces/{id}/assessments)
  with boolean feedback.value; skip gracefully if no trace ID
- Export getWorkspaceHostname() from ai-sdk-providers for use in feedback route
- Add MSW mock for MLflow assessments endpoint with body validation
- Add route tests: text streaming, trace ID capture, and feedback submission

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…source_id

- Extend message-meta-store.ts to store assessmentId per (messageId, userId)
  for ephemeral-mode deduplication
- Hoist existingFeedback lookup before MLflow call so DB-mode assessment ID
  is available for deduplication check
- PATCH /api/3.0/mlflow/traces/{id}/assessments/{assessmentId} when an
  assessment already exists; POST only for first submission
- Use session.user.email (falls back to id) as MLflow source_id
- Add MSW PATCH mock for MLflow assessments endpoint
- Add test: second feedback submission uses PATCH instead of POST

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bricks/ai-sdk-provider

Instead of injecting databricks_options.return_trace in the custom databricksFetch
interceptor, forward it via providerOptions so the upstream provider handles it natively.
Capture the resulting trace_id via includeRawChunks/onChunk from the raw SSE events,
removing the global traceIdMap state and fallback retrieval logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…m git tracking

This directory was an embedded copy of the chatbot app template that should not
be tracked separately. The canonical copy lives in e2e-chatbot-app-next/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The MLflow assessments PATCH API requires an update_mask field specifying
which fields to update. Without it the request fails with INVALID_PARAMETER_VALUE.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
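An illustrative PATCH body sketch. The exact field layout is an assumption; the two details taken from the commit message are that feedback.value is a boolean and that update_mask must name the fields being updated, otherwise the API rejects the request with INVALID_PARAMETER_VALUE.

```typescript
// Hypothetical body builder for PATCH /api/3.0/mlflow/traces/{id}/assessments/{assessmentId}.
function buildAssessmentPatchBody(value: boolean): Record<string, unknown> {
  return {
    feedback: { value },     // boolean thumbs-up/down value
    update_mask: 'feedback', // required by the MLflow assessments PATCH API
  };
}
```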
When API_PROXY is set, sends x-mlflow-return-trace-id: true as a
request header. The MLflow AgentServer appends a standalone SSE event
{ trace_id: "tr-..." } at the end of the stream, which onChunk now
captures alongside the existing Databricks serving endpoint path.

Also adds:
- routes-api-proxy Playwright project + second webServer on port 3003
  for testing the full API_PROXY trace-ID → feedback path
- Configurable Vite port/backend via PORT and BACKEND_URL env vars,
  enabling multiple local server pairs to run simultaneously
- 2 integration tests (trace-id-capture.api-proxy.test.ts) covering
  happy-path capture and PATCH deduplication

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Schema

The AI SDK's generateId produces 16-char alphanumeric IDs (nanoid), not
UUIDs. Assistant response messages receive nanoid IDs server-side, so
sending them back as previousMessages failed z.string().uuid() validation.
Relax the check to z.string() for previous message IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s/ai-sdk-provider

The upstream package was rebuilt (OpenResponses streaming support) and
dropped the extractApprovalStatus export. Remove the import and the
MCP denial short-circuit that depended on it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sid Murching <sid.murching@databricks.com>
id,
messages: initialMessages,
experimental_throttle: 100,
experimental_throttle: 50, // Reduced from 100ms for smoother streaming display
smurching (author):

Stick with 100 here, no need to make this change in this PR

smurching (author):

Done — already reverted to 100 in the current commits.

return;
}

// Don't try to resume if we haven't received any stream parts in this session
smurching (author):

Is this needed?

smurching (author):

The guard prevents a premature resume when a stream attempt produced 0 parts (e.g., connection dropped before the first SSE event arrived). Without it, the client would immediately fire a GET /api/chat/:id/stream request even though there is nothing to resume. Keeping it for now — happy to remove if you prefer simpler logic.

let feedbackMap: Record<string, Feedback> = {};
try {
const feedbackResponse = await fetch(`/api/feedback/chat/${chatId}`, {
credentials: 'include',
smurching (author):

Why is credentials: 'include' needed?

smurching (author):

Removed — these are same-origin requests via the Vite proxy so credentials: 'include' is redundant.

// Default response for non-MCP requests
// Default response: split text into per-word chunks to replicate real streaming
// (one response.output_text.delta per token). This is necessary to reproduce
// the delta-boundary bug where interleaved raw+text-delta chunks break streaming.
smurching (author):

The "bug" framing doesn't make sense out of context here, since it's been fixed. Maybe just clarify that this helps validate that interleaved raw + text-delta chunks produced by streamText can be parsed properly.

smurching (author):

Updated the comment to remove the 'bug' framing.

* the Responses API returns multiple text-delta events (real Databricks behavior).
*
* The critical bug this catches: when raw chunks are interleaved with text-delta
* chunks in the provider stream, the delta-boundary transformer used to inject a
smurching (author):

Is this delta-boundary transformer in the current codebase? I don't think it is; it's in databricks-ai-bridge, right? We can probably remove this test from here.

smurching (author):

Removed — you're right that the delta-boundary transformer belongs to databricks-ai-bridge. The basic SSE format is already exercised by the other route tests.

smurching and others added 2 commits February 24, 2026 23:17
- feedback.ts: Remove source from PATCH body (MLflow rejects source updates)
- chat.ts: Add originalMessages to createUIMessageStream for continuation support
- tsdown.config.ts: Bundle @databricks/ai-sdk-provider with server
- helpers.ts: Remove mcp_approval_response from mock output (it's input-only)
- tests/pages/chat.ts: Fix isGenerationComplete race condition via stop-button DOM check
- tests/e2e/oauth-error.test.ts: Fix response-listener registration order
- api-mock-handlers.ts: Add source-update rejection matching real MLflow behavior
- databricks.yml: Set default serving endpoint name
- vite.config.ts: parseInt -> Number.parseInt (Biome)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
smurching and others added 4 commits February 24, 2026 23:21
# Conflicts:
#	e2e-chatbot-app-next/packages/ai-sdk-providers/src/providers-server.ts
#	e2e-chatbot-app-next/server/src/index.ts
#	e2e-chatbot-app-next/tests/e2e/mcp-approval.test.ts
#	e2e-chatbot-app-next/tests/routes/context-injection.test.ts
- chat.ts: Remove console.warn/log; update trace-extraction comments
- feedback.ts: Rename getMessageMeta→getMessageMetadata; remove success log
- message-meta-store.ts: Rename export; add clarifying comment about best-effort semantics
- useChatData.ts: Remove redundant credentials:'include' (same-origin via Vite proxy)
- message.tsx: Revert isLoading memo guard to prevProps.isLoading !== nextProps.isLoading
- api-mock-handlers.ts: Update delta-boundary comment to remove 'bug' framing
- Delete databricks-fetch.test.ts (not meaningful for LLM endpoints)
- Delete text-streaming.test.ts (delta-boundary transformer lives in databricks-ai-bridge)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use message object reference check while loading instead of relying on
fast-deep-equal for parts. fast-deep-equal short-circuits on identical
references, so if the AI SDK mutates parts in place during streaming the
component never re-renders mid-stream and output appears all at once.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
await expect(page.getByTestId('multimodal-input')).toBeVisible();

// Send a message to create a chat
// Register listener before sending to avoid race condition where fast mock responses
smurching (author):

This change seems complex; is it necessary, and why? Same question for e2e-chatbot-app-next/tests/pages/chat.ts.

smurching (author):

The chat.ts change to isGenerationComplete() is needed: the original waitForResponse('/api/chat') was registered after sendUserMessage(), so fast MSW mock responses (which complete in ~0ms) could arrive before Playwright had set up the listener, causing the test to hang indefinitely waiting for a response that already finished. The DOM-based stop-button approach avoids this entirely — you can check DOM state at any time regardless of when the response arrived.

The oauth-error.test.ts change was more complex than necessary though. Since isGenerationComplete() already handles the race condition, the test just needs to call it directly. Simplified in the latest commit.

skipInEphemeralMode,
} from '../helpers';

test.describe('/api/feedback', () => {
smurching (author), Feb 25, 2026:

Do we want a test here for deduplication, i.e. updating existing feedback instead of adding new assessments? If it's already covered elsewhere, I wonder if it'd make more sense to test it here instead (I would expect such a test to live under feedback.test.ts).

smurching (author):

It was covered in trace-id-capture.test.ts (where it ended up as a natural follow-on to the trace-capture test), but you're right that it belongs in feedback.test.ts. Moved it there in the latest commit and removed the duplicate from trace-id-capture.test.ts.

- oauth-error.test.ts: replace inline responsePromise approach with
  chatPage.isGenerationComplete() (the stop-button DOM method already
  handles the fast-mock race condition)
- feedback.test.ts: add deduplication test (second POST → PATCH)
- trace-id-capture.test.ts: remove duplicate dedup test (now lives
  in feedback.test.ts where it belongs)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
description: "Name of the model serving endpoint to be used by the app"
# TODO: uncomment the line below and specify a default value to avoid needing to specify it on each deployment
# default: "your-serving-endpoint-name-goes-here"
default: "agents_smurching-default-feb_2026_agent"
smurching (author):

Should revert back to the old value here

smurching (author):

Reverted in the latest commit.

"npm": ">=8.0.0"
}
},
"../../databricks-ai-bridge/integrations/ai-sdk-provider": {
smurching (author):

Should be able to drop this change after releasing the AI SDK provider

smurching (author):

Agreed — will update this once databricks-ai-bridge PR #312 is merged and the package is published to npm.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines +123 to +127
mlflowResponse = await fetch(
`${hostUrl}/api/3.0/mlflow/traces/${traceId}/assessments/${existingAssessmentId}`,
{
method: 'PATCH',
headers: {
Contributor:

DRY: these fetch calls could be abstracted away. BTW, I think we have a databricksFetch somewhere in the code that could be reused?

* In DB mode, it acts as a short-term cache to avoid extra DB lookups for
* recently streamed messages. Data is lost on server restart in either case.
*/
const store = new Map<string, { traceId: string | null; chatId: string }>();
Contributor:

I'd say this pattern isn't great, since it's basically a memory leak (no TTL/cleanup/eviction strategy), but it's probably good enough given we keep only IDs/metadata here.

return;
}
}

Contributor:

(Feedback for the LLM, probably): there are some weird newline patterns at the end of a few files; it's good to keep this consistent (one newline at the end of each file).

// Wait for the assistant message to be rendered in the DOM before reading it.
// Even after generation completes, the React throttle may delay the final render.
await expect(page.getByTestId('message-assistant')).toBeVisible({
timeout: 10000,
Contributor:

Is such a big timeout needed here?

<Action
tooltip="Thumbs up"
onClick={() => handleFeedback('thumbs_up')}
className={feedback === 'thumbs_up' ? 'text-green-600' : ''}
Contributor:

@smurching you can read more about how tailwind works here
https://tailwindcss.com/docs/styling-with-utility-classes

const [feedback, setFeedback] = useState<'thumbs_up' | 'thumbs_down' | null>(
initialFeedback?.feedbackType || null,
);
const isSubmittingRef = useRef(false);
Contributor:

Should we also (or instead) use a stateful isSubmitting so the new action buttons get a proper loading state?

smurching and others added 8 commits February 26, 2026 16:52
Adds a test-only /api/test/store-message-meta endpoint and a feedback
test that verifies the server returns success:true (without
mlflowAssessmentId) when the message has no trace ID — i.e. when using
a foundation model endpoint like databricks-claude-sonnet-4-5 that uses
the FMAPI chat completions format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both the message-meta store and assessment store now evict the oldest
entry (by insertion order) when they hit 10 000 entries (~10 MB). Uses
Map's guaranteed insertion-order iteration — no new dependencies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
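The eviction trick relies on JavaScript Maps iterating in insertion order, so keys().next() yields the oldest entry. A minimal sketch (the 10 000 cap matches the commit; maxEntries is a parameter here purely for illustration):

```typescript
function setWithEviction<K, V>(
  store: Map<K, V>,
  key: K,
  value: V,
  maxEntries = 10_000,
): void {
  // Only evict when inserting a genuinely new key at capacity.
  if (!store.has(key) && store.size >= maxEntries) {
    const oldest = store.keys().next().value as K; // oldest by insertion order
    store.delete(oldest);
  }
  store.set(key, value);
}
```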
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Databricks Apps platform builds the server from source and needs to
resolve @databricks/ai-sdk-provider directly. Listing it only as a
transitive dep of @chat-template/ai-sdk-providers wasn't sufficient.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… no trace

- Add `traceId: string | null` to `CustomUIDataTypes` so the AI SDK
  can carry it as a `data-traceId` message part to the client.
- In chat.ts, manually drain the AI stream (instead of writer.merge)
  so we can append the `data-traceId` part after all model chunks are
  processed and the traceId from onChunk is known.
- In message-actions.tsx, read `data-traceId` from message.parts to
  determine feedback support; show disabled thumbs buttons with a
  "Feedback not available for this endpoint" tooltip when traceId is
  null (e.g. foundation model endpoints that don't emit traces).
- Update memo comparator to re-render when the traceId part arrives.
- Fix server/src/index.ts: move store-message-meta test endpoint
  outside the MSW try/catch so it's registered even if MSW setup fails.
- Update expectedSSE in tests to include the new data-traceId chunk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
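The data-traceId handoff described above can be sketched in two small helpers: the server appends the part after draining all model chunks, and the client gates feedback on its value. The part shape follows the AI SDK's data-* message part convention but is an assumption here.

```typescript
interface MessagePart { type: string; data?: unknown }

// Server side: append the part once the drained stream has yielded the trace ID.
function appendTraceIdPart(parts: MessagePart[], traceId: string | null): MessagePart[] {
  return [...parts, { type: 'data-traceId', data: traceId }];
}

// Client side: thumbs buttons are disabled when the part is missing or carries
// null (e.g. foundation model endpoints that don't emit traces).
function feedbackSupported(parts: MessagePart[]): boolean {
  const part = parts.find((p) => p.type === 'data-traceId');
  return part !== undefined && part.data !== null;
}
```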
Signed-off-by: Sid Murching <sid.murching@databricks.com>
Signed-off-by: Sid Murching <sid.murching@databricks.com>
@smurching smurching merged commit 68201a8 into databricks:main Feb 27, 2026