WIP: add MCP Client with conversation tracking and chat UI #26343

Draft
pmbrull wants to merge 34 commits into main from task/implement-mcp-client-with-message-tracki-9d9ca8a4

@pmbrull (Collaborator) commented Mar 9, 2026

Screen.Recording.2026-03-10.at.17.31.58.mov

Summary

  • Implements an MCP Client in OpenMetadata that allows users to interact with the existing MCP Server tools through a chat UI
  • Adds full conversation and message persistence with mcp_conversation and mcp_message tables
  • Supports configurable LLM providers (OpenAI, Anthropic) via mcpClientConfiguration in openmetadata.yaml
  • Provides REST API at /v1/mcp-client for chat, conversation CRUD, and message listing
  • Includes MUI-based chat page with conversation sidebar, markdown rendering, and tool call display

Changes

Backend

  • JSON Schemas: McpConversation, McpMessage, MessageBlock, ToolCall, TokenUsage, ChatContentType
  • SQL Migrations: mcp_conversation and mcp_message tables for MySQL and PostgreSQL with generated columns and indexes
  • Configuration: McpClientConfiguration with enabled, provider, apiKey, model, apiEndpoint, systemPrompt
  • LLM Client: Abstraction layer (LlmClient, LlmResponse, LlmMessage, LlmToolCall) with OpenAI and Anthropic implementations
  • DAOs: McpConversationDAO and McpMessageDAO in CollectionDAO
  • Repositories: McpConversationRepository and McpMessageRepository for CRUD operations
  • Service: McpClientService orchestrator with tool call loop (user message → LLM → tool calls → tool execution → LLM response)
  • REST API: McpClientResource with endpoints for chat, conversations, and messages
  • MCP Integration: Tool executor registration from openmetadata-mcp module via McpServer
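
The tool call loop named in the Service bullet above (user message → LLM → tool calls → tool execution → LLM response) can be sketched roughly as follows. This is a minimal illustration, not the PR's Java implementation: every type and function name here (`Message`, `Llm`, `ToolExecutor`, `runChat`, `maxRounds`) is hypothetical, and the calls are synchronous only to keep the sketch short, whereas real LLM and tool calls would be asynchronous.

```typescript
// Hypothetical stand-ins for illustration; none of these names come from the PR.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };

type Message =
  | { role: "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

type LlmReply = { content: string; toolCalls: ToolCall[] };
type Llm = (history: Message[]) => LlmReply;
type ToolExecutor = (call: ToolCall) => string;

// Runs one chat turn: keeps calling the LLM until it answers without tool calls.
function runChat(
  userText: string,
  llm: Llm,
  executeTool: ToolExecutor,
  maxRounds = 5 // assumed safety cap so a misbehaving model cannot loop forever
): Message[] {
  const history: Message[] = [{ role: "user", content: userText }];

  for (let round = 0; round < maxRounds; round++) {
    const reply = llm(history);
    history.push({
      role: "assistant",
      content: reply.content,
      toolCalls: reply.toolCalls,
    });

    // No tool calls means the assistant produced its final answer.
    if (reply.toolCalls.length === 0) {
      return history;
    }

    // Execute each requested tool and feed the result back to the LLM.
    for (const call of reply.toolCalls) {
      history.push({
        role: "tool",
        toolCallId: call.id,
        content: executeTool(call),
      });
    }
  }

  throw new Error(`No final answer after ${maxRounds} tool rounds`);
}
```

The returned history is the list of messages a persistence layer would store for the conversation; the round cap is a design choice of this sketch, not something the PR description states.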

Frontend

  • API Client: mcpClientAPI.ts with typed interfaces for all MCP client operations
  • Chat Page: McpChatPage with conversation sidebar, message list, chat input, and tool call accordion display
  • Routing: New /mcp-chat route with sidebar navigation entry
  • i18n: Labels and messages for MCP Chat UI

Test plan

  • Configure mcpClientConfiguration in openmetadata.yaml with a valid LLM provider and API key
  • Start OpenMetadata and verify the MCP Chat page is accessible via sidebar
  • Send a message and verify the LLM responds with tool calls executed against MCP Server tools
  • Verify conversation persistence — reload and check history is preserved
  • Test conversation CRUD: create, list, get with messages, delete
  • Verify MySQL and PostgreSQL migration scripts create tables correctly

Closes https://github.com/open-metadata/ai-platform/issues/429

pmbrull and others added 2 commits March 9, 2026 12:31
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement an MCP Client in OpenMetadata that allows users to interact
with the existing MCP Server tools through a chat UI, with full
conversation and message persistence.

Backend:
- JSON schemas for McpConversation, McpMessage, and content types
- SQL migrations for mcp_conversation and mcp_message tables (MySQL + PostgreSQL)
- McpClientConfiguration for LLM provider settings (OpenAI/Anthropic)
- LLM client abstraction with OpenAI and Anthropic implementations
- McpConversationRepository and McpMessageRepository for CRUD
- McpClientService orchestrator with tool call loop
- McpClientResource REST API at /v1/mcp-client
- MCP tool executor registration from openmetadata-mcp module

Frontend:
- mcpClientAPI.ts REST client for chat, conversations, and messages
- McpChatPage with conversation sidebar, message list, and chat input
- MUI-based components with markdown rendering and tool call display
- Route, sidebar navigation, and i18n labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot (Contributor) commented Mar 9, 2026

Jest test Coverage

UI tests summary

Lines: 65% (coverage badge)
Statements: 65.74% (57389/87291)
Branches: 45.25% (30202/66735)
Functions: 48.24% (9080/18819)

pmbrull and others added 10 commits March 9, 2026 17:34
…eaming

Introduce the foundational types for server-sent event streaming in the
MCP chat feature. ChatEvent is a Java record with static factory methods
for each event type (conversation_created, text, tool_call_start,
tool_call_end, message_complete, error, done). ChatEventEmitter is a
functional interface consumed by the streaming chat method.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a new chatStream() method that mirrors the existing chat() logic
but emits ChatEvent events via a ChatEventEmitter callback instead of
accumulating results and returning a ChatResponse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add POST /v1/mcp-client/chat/stream endpoint that returns text/event-stream
using Jersey StreamingOutput. Each ChatEvent is written as an SSE frame
with event name and JSON data. Includes writeSseEvent helper method and
proper error handling with Cache-Control and X-Accel-Buffering headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
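
The SSE frame layout described in the /chat/stream commit above (an event name plus JSON data per frame) can be illustrated with a small formatter. This is a sketch of the wire format only, written in TypeScript rather than the PR's Java; `formatSseEvent` and `SSE_HEADERS` are hypothetical names, and the header values shown are common choices assumed here, since the commit message names the headers but not their values.

```typescript
// Assumed header values; the commit message only names the headers themselves.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  "X-Accel-Buffering": "no", // asks buffering reverse proxies such as nginx to pass frames through
};

// Hypothetical helper: serialises one event as a single SSE frame.
function formatSseEvent(event: string, data: object): string {
  if (/[\r\n]/.test(event)) {
    throw new Error("SSE event names must not contain line breaks");
  }
  // JSON.stringify without an indent argument never emits raw newlines,
  // so the payload always fits on one "data:" line.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The blank line that ends each frame is what lets a client detect frame boundaries, so it must be written and flushed together with the data line.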
Add ChatStreamEvent discriminated union type, ChatStreamCallbacks
interface, and streamChatMessage function that uses native fetch to
consume the /chat/stream SSE endpoint with proper auth and SSE parsing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
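
As a rough illustration of the client side described in the commit above, the sketch below pairs a discriminated union with an incremental SSE parser. The event names are taken from the ChatEvent commit message; every payload field, and the names `SseParser`, `SseFrame`, and `toEvent`, are assumptions made for this example and may not match the PR's `ChatStreamEvent`.

```typescript
// Event names come from the commit message; payload fields are assumed.
type ChatStreamEvent =
  | { type: "conversation_created"; conversationId: string }
  | { type: "text"; content: string }
  | { type: "tool_call_start"; name: string }
  | { type: "tool_call_end"; name: string; result: string }
  | { type: "message_complete" }
  | { type: "error"; message: string }
  | { type: "done" };

interface SseFrame {
  event: string;
  data: string;
}

// Buffers decoded text chunks and yields only complete frames, so a frame
// split across two network reads is still parsed correctly.
class SseParser {
  private buffer = "";

  push(chunk: string): SseFrame[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const frames: SseFrame[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const raw = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      let event = "message"; // default event name when no "event:" line is sent
      const data: string[] = [];
      for (const line of raw.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice(6).trim();
        } else if (line.startsWith("data:")) {
          data.push(line.slice(5).replace(/^ /, ""));
        }
      }
      if (data.length > 0) {
        frames.push({ event, data: data.join("\n") });
      }
      end = this.buffer.indexOf("\n\n");
    }
    return frames;
  }
}

// Unchecked cast for brevity; a real client should validate the payload shape.
function toEvent(frame: SseFrame): ChatStreamEvent {
  return { type: frame.event, ...JSON.parse(frame.data) } as ChatStreamEvent;
}
```

In a fetch-based client, each chunk read from the response body would be decoded to text, passed to `push`, and the resulting events dispatched to the matching callback with a `switch` on `type`.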
Replace sendChatMessage with streamChatMessage for progressive UI
updates. Shows assistant text and tool calls incrementally as SSE
events arrive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The method signature was missing the content parameter, causing
compilation errors in McpClientService.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Sidebar title: "MCP Chat" → "Chat"
- Empty state: clean centered heading + pill-shaped input with send icon
- ChatInput: replace separate Button with Send01 icon inside TextField
- Update i18n: mcp-chat-empty, mcp-chat-placeholder

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Human message bubbles: grey[200] bg with dark text (was unreadable
  white-on-primary-blue)
- Assistant bubbles: white bg with border (was grey[100])
- Trash icon: only visible on conversation hover
- Remove message count subtitle from conversation list
- Simplify token display color

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show a CircularProgress spinner with "Thinking..." text when the
assistant message has no content yet (during streaming before the
first text or tool_call event arrives).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Create McpChatApplication as a separate installable app (marketplace only, not auto-installed)
- Move LLM config from openmetadata.yaml to app configuration (provided at install time)
- McpClientResource dynamically looks up McpChatApplication from ApplicationContext
- Simplify ToolExecutor interface (authorizer/limits captured in McpServer closure)
- Add McpChatPlugin for conditional Chat sidebar item (only shown when app is installed)
- Add UI application schema for install form (LLM provider, API key, model, endpoint, system prompt)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Verify MCP Chat nav item appears in sidebar when app is installed
- Test sidebar click navigates to /mcp-chat and renders chat UI
- Test send button enables with text input
- Uses proper fixtures, API-based app install/teardown, test.step()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CompletableFuture.runAsync calls had no error handling, silently
swallowing failures from repository calls after title generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pmbrull pmbrull changed the title feat: add MCP Client with conversation tracking and chat UI WIP: add MCP Client with conversation tracking and chat UI Mar 12, 2026
@github-actions (Contributor)

TypeScript types have been updated based on the JSON schema changes in the PR

@github-actions (Contributor)

🟡 Playwright Results — all passed (2 flaky)

✅ 453 passed · ❌ 0 failed · 🟡 2 flaky · ⏭️ 2 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 453 0 2 2
🟡 2 flaky test(s) (passed on retry)
  • Pages/Customproperties-part1.spec.ts › sqlQuery shows scrollable CodeMirror container and no expand toggle (shard 1, 1 retry)
  • Pages/CustomThemeConfig.spec.ts › Update Hover and selected Color (shard 1, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

gitar-bot bot commented Mar 12, 2026

🔍 CI failure analysis for a73638e: 1 test failure in integration tests caused by premature NDManager lifecycle closure in DjlEmbeddingClient, cascading to OpenSearch connection pool exhaustion and search index timeout. PR's new MCP/LLM client infrastructure increases concurrent resource demands, exposing improper resource cleanup timing.

Overview

Analyzed 53 logs across 43 error templates. CI pipeline shows a single critical test failure in integration tests (PostgreSQL + OpenSearch) with high confidence attribution to infrastructure resource management. The failure is a cascading consequence of DJL NDManager lifecycle mismanagement that degrades search indexing capabilities.

Failures

Search Index Entity Indexing Timeout (confidence: high)

  • Type: infrastructure
  • Affected jobs: integration-tests-postgres-opensearch
  • Related to PR: yes
  • Root cause: DjlEmbeddingClient prematurely closes NDManager resources before embedding operations complete (line 116). This triggers cascade failures: OpenSearch connection pool exhaustion ("Connection pool shut down"), vector service failures, and ultimately prevents search index updates within the 90-second timeout window.
  • Suggested fix:
    1. Review DjlEmbeddingClient.embedBatch() resource lifecycle—ensure NDManager/Predictor resources are scoped to live for the full duration of async embedding operations, not closed immediately after launch
    2. Implement proper resource pooling/caching for DJL models to eliminate repeated creation/destruction cycles
    3. Add synchronization around OpenSearch vector index updates to prevent connection pool saturation during embedding failures
    4. Consider adding backpressure/throttling to async vector embedding operations launched during entity creation, especially with new concurrent MCP/LLM client demands

Test Result

  • Test: DatabaseSchemaResourceIT.checkCreatedEntity
  • Error: "Condition with alias 'Wait for entity to appear in search index' didn't complete within 1 minutes 30 seconds... expected: but was: "
  • Test run summary: 10,857 tests run, 946 skipped, 1 failed

Summary

  • PR-related failures: 1 (indirect) — PR's new MCP client and LLM infrastructure adds concurrent resource demands that expose existing NDManager lifecycle bug in vector embedding service
  • Infrastructure/resource failures: 1 — DJL NDManager lifecycle mismanagement causing cascading OpenSearch connection pool and vector indexing failures
  • Recommended action: Prioritize fixing NDManager resource lifecycle in DjlEmbeddingClient. This is not a new bug introduced by the PR itself, but the PR's additional concurrency has surfaced it. Once fixed, integration tests should pass reliably.
Code Review ⚠️ Changes requested 15 resolved / 17 findings

Adds MCP Client with conversation tracking and chat UI, resolving 15 prior issues including null pointer exceptions, connection leaks, and unbounded message loading. However, handleSelectAll incorrectly replaces selected items instead of merging, and prefixCls="w-full" on antd Space breaks component styling—both must be fixed before merge.

⚠️ Bug: handleSelectAll replaces all selections instead of merging

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:1

The new handleSelectAll function in AddTestCaseList replaces the entire selectedItems Map with only the currently-loaded items, discarding previously selected items. This is inconsistent with handleCardClick which correctly merges with existing selections using setSelectedItems((prevItems) => ...).

When selecting all: new Map(items.map(...)) only includes current items, losing any previously individually-selected items.
When deselecting: setSelectedItems(new Map()) wipes ALL selections, not just visible ones.

Suggested fix
In AddTestCaseList handleSelectAll, use functional state update:
setSelectedItems((prev) => {
  const next = new Map(prev);
  if (allCurrentlySelected) {
    items.forEach((item) => next.delete(item.id ?? ''));
  } else {
    items.forEach((t) => next.set(t.id ?? '', t));
  }
  onChange?.([...next.values()]);
  return next;
});
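
The merge behaviour the suggested fix aims for can be checked in isolation with a pure helper. This is a sketch under assumptions: `toggleSelectAll` and the `Item` shape are hypothetical and do not appear in the PR, and the "all currently selected" test is computed inside the helper instead of being passed in.

```typescript
interface Item {
  id?: string;
}

// Returns a new selection map: if every visible item is already selected,
// only the visible items are removed; otherwise the visible items are added.
// Selections that are not currently visible are always preserved.
function toggleSelectAll<T extends Item>(
  prev: Map<string, T>,
  visible: T[]
): Map<string, T> {
  const next = new Map(prev);
  const allVisibleSelected =
    visible.length > 0 && visible.every((item) => next.has(item.id ?? ""));

  for (const item of visible) {
    const key = item.id ?? "";
    if (allVisibleSelected) {
      next.delete(key);
    } else {
      next.set(key, item);
    }
  }
  return next;
}
```

A helper like this can be called from the functional state update, and it also makes it straightforward to keep side effects such as `onChange` outside the updater, which React expects to be a pure function.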
⚠️ Bug: prefixCls="w-full" on antd Space breaks component styling

📄 openmetadata-ui/src/main/resources/ui/src/rest/mcpClientAPI.ts:1

In AddTestCaseList.component.tsx (around line 201 in the diff), prefixCls="w-full" is added to an antd <Space> component. The prefixCls prop changes the CSS class prefix for antd's internal generated class names (default "ant"), changing classes like ant-space-item to w-full-item and completely breaking the component's layout and spacing. The desired full-width styling is already applied via className="w-full" on the same element. Remove the prefixCls prop.

✅ 15 resolved
Bug: MCP Client enabled flag is never checked; chat works with null API key

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/mcpclient/McpClientResource.java:92 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:68 📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/LlmClientFactory.java:21
The McpClientConfiguration.enabled flag (default false) is never consulted. McpClientResource.initialize() always creates McpClientService, which always creates a real LLM client via LlmClientFactory.create(). When enabled=false and no API key is configured (the default deployment), any user hitting /v1/mcp-client/chat will trigger an HTTP request to OpenAI with Authorization: Bearer null, resulting in a confusing 401 error from the external API rather than a clear 'feature not enabled' response.

The initialize() method should skip service creation when mcpConfig.isEnabled() is false, and the endpoints should return HTTP 404 or 501 when the feature is disabled.
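
A guard of the kind this finding asks for might look like the sketch below. It is an illustration in TypeScript, not the PR's Java code: `checkMcpClientAvailable` and `GuardResult` are hypothetical names, and 501 is picked from the two status codes the finding mentions.

```typescript
interface McpClientConfig {
  enabled: boolean;
  apiKey?: string | null;
}

type GuardResult =
  | { ok: true }
  | { ok: false; status: 501; message: string };

// Decides up front whether chat requests may reach the LLM provider at all,
// so a disabled or unconfigured feature fails with a clear local error.
function checkMcpClientAvailable(
  config: McpClientConfig | undefined
): GuardResult {
  if (!config || !config.enabled) {
    return { ok: false, status: 501, message: "MCP client is not enabled" };
  }
  if (!config.apiKey) {
    return {
      ok: false,
      status: 501,
      message: "MCP client has no LLM API key configured",
    };
  }
  return { ok: true };
}
```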

Bug: NullPointerException when toolExecutor is null and LLM returns tool calls

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:137 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:61
In McpClientService.chat() (line 137), toolExecutor.executeTool(...) is called without any null check. toolExecutor is a volatile field defaulting to null, only set asynchronously via setToolExecutor() after the MCP server registers. If the LLM returns tool calls before the tool executor is registered (or if registration fails silently, as warned on line 117 of McpServer), this will throw a NullPointerException in a request handler.

Currently this is partially mitigated by the fact that an empty toolDefinitions list means the LLM won't generate tool calls, but this coupling is fragile and not documented.

Security: LLM API key lacks serialization protection on config class

📄 openmetadata-service/src/main/java/org/openmetadata/service/config/McpClientConfiguration.java:28
The apiKey field in McpClientConfiguration has @JsonProperty("apiKey") but no @JsonIgnore for serialization. While no current endpoint directly returns this config, the field is fully accessible via OpenMetadataApplicationConfig.getMcpClientConfiguration().getApiKey(). If any config-dump, diagnostics endpoint, or error handler serializes the application config, the API key will be leaked in plaintext. This is a defense-in-depth concern.

Bug: Thread interrupt flag set incorrectly on IOException

📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/AnthropicLlmClient.java:77 📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/OpenAiLlmClient.java:74
Both AnthropicLlmClient and OpenAiLlmClient catch IOException | InterruptedException in a single handler and unconditionally call Thread.currentThread().interrupt(). When the caught exception is an IOException (not an interrupt), this incorrectly sets the thread's interrupt flag, which can cause downstream blocking operations (e.g., JDBC connections returning to pool) to throw unexpected InterruptedExceptions.

Edge Case: getConversationWithMessages truncates to 20 messages silently

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:224 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:261
In McpClientService.getConversationWithMessages() (line 224), messages are fetched with CONVERSATION_HISTORY_LIMIT = 20, so the GET /conversations/{id} endpoint silently truncates conversations longer than 20 messages without informing the client. The UI's fetchMessages uses the separate /messages endpoint with limit=50, so the truncation mainly affects the direct conversation GET endpoint. However, buildLlmMessages() uses the same limit, so the LLM also loses the context of conversations with more than 20 messages.
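
One possible way to make such a limit visible to callers is to return the window together with a truncation flag. The sketch below is an assumption about how that could look and is not necessarily how the PR resolved the finding; `lastMessages` and `MessageWindow` are hypothetical names.

```typescript
interface MessageWindow<T> {
  items: T[]; // the most recent messages, oldest first
  total: number; // number of messages in the whole conversation
  truncated: boolean; // true when older messages were left out
}

// Keeps the newest `limit` messages and reports whether anything was dropped.
function lastMessages<T>(messages: T[], limit: number): MessageWindow<T> {
  // slice(-0) would return the whole array, so non-positive limits are handled explicitly.
  const items = limit > 0 ? messages.slice(-limit) : [];
  return {
    items,
    total: messages.length,
    truncated: messages.length > items.length,
  };
}
```

With a flag like `truncated`, a client could show that earlier messages exist, and the same window could be reused when building the LLM context so that both code paths apply one documented limit.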

...and 10 more resolved from earlier reviews

Labels: Ingestion, safe to test