WIP: add MCP Client with conversation tracking and chat UI by pmbrull · Pull Request #26343 · open-metadata/OpenMetadata

pmbrull · 2026-03-09T12:04:38Z


Summary

Implements an MCP Client in OpenMetadata that allows users to interact with the existing MCP Server tools through a chat UI
Adds full conversation and message persistence with mcp_conversation and mcp_message tables
Supports configurable LLM providers (OpenAI, Anthropic) via mcpClientConfiguration in openmetadata.yaml
Provides REST API at /v1/mcp-client for chat, conversation CRUD, and message listing
Includes MUI-based chat page with conversation sidebar, markdown rendering, and tool call display

Changes

Backend

JSON Schemas: McpConversation, McpMessage, MessageBlock, ToolCall, TokenUsage, ChatContentType
SQL Migrations: mcp_conversation and mcp_message tables for MySQL and PostgreSQL with generated columns and indexes
Configuration: McpClientConfiguration with enabled, provider, apiKey, model, apiEndpoint, systemPrompt
LLM Client: Abstraction layer (LlmClient, LlmResponse, LlmMessage, LlmToolCall) with OpenAI and Anthropic implementations
DAOs: McpConversationDAO and McpMessageDAO in CollectionDAO
Repositories: McpConversationRepository and McpMessageRepository for CRUD operations
Service: McpClientService orchestrator with tool call loop (user message → LLM → tool calls → tool execution → LLM response)
REST API: McpClientResource with endpoints for chat, conversations, and messages
MCP Integration: Tool executor registration from openmetadata-mcp module via McpServer
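
The tool call loop listed for McpClientService (user message → LLM → tool calls → tool execution → LLM response) can be sketched roughly as below. This is a TypeScript illustration rather than the Java in this PR: the type names, `runChat`, and the `MAX_TOOL_ROUNDS` cap are all assumptions made for the example, and the calls are synchronous for brevity.

```typescript
// Hypothetical shapes; the PR's Java classes (LlmClient, LlmMessage, LlmToolCall) may differ.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type Message =
  | { role: 'user' | 'assistant'; content: string; toolCalls?: ToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string };
type LlmResponse = { content: string; toolCalls: ToolCall[] };

interface LlmClient {
  complete(messages: Message[]): LlmResponse;
}
type ToolExecutor = (name: string, args: Record<string, unknown>) => string;

// Assumed cap so a model that keeps requesting tools cannot loop forever.
const MAX_TOOL_ROUNDS = 5;

function runChat(
  llm: LlmClient,
  executeTool: ToolExecutor,
  userText: string,
  history: Message[] = []
): Message[] {
  const messages: Message[] = [...history, { role: 'user', content: userText }];
  for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
    const response = llm.complete(messages);
    messages.push({
      role: 'assistant',
      content: response.content,
      toolCalls: response.toolCalls,
    });
    if (response.toolCalls.length === 0) {
      return messages; // no tool requests left: this is the final answer
    }
    // Run each requested tool and feed its result back to the model.
    for (const call of response.toolCalls) {
      const result = executeTool(call.name, call.args);
      messages.push({ role: 'tool', toolCallId: call.id, content: result });
    }
  }
  throw new Error(`Tool call loop exceeded ${MAX_TOOL_ROUNDS} rounds`);
}
```

The returned array is the full transcript for the turn, which is the sort of data a message table such as mcp_message would persist.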

Frontend

API Client: mcpClientAPI.ts with typed interfaces for all MCP client operations
Chat Page: McpChatPage with conversation sidebar, message list, chat input, and tool call accordion display
Routing: New /mcp-chat route with sidebar navigation entry
i18n: Labels and messages for MCP Chat UI

Test plan

Configure mcpClientConfiguration in openmetadata.yaml with a valid LLM provider and API key
Start OpenMetadata and verify the MCP Chat page is accessible via sidebar
Send a message and verify the LLM responds with tool calls executed against MCP Server tools
Verify conversation persistence — reload and check history is preserved
Test conversation CRUD: create, list, get with messages, delete
Verify MySQL and PostgreSQL migration scripts create tables correctly

Closes https://github.com/open-metadata/ai-platform/issues/429

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement an MCP Client in OpenMetadata that allows users to interact with the existing MCP Server tools through a chat UI, with full conversation and message persistence.

Backend:
- JSON schemas for McpConversation, McpMessage, and content types
- SQL migrations for mcp_conversation and mcp_message tables (MySQL + PostgreSQL)
- McpClientConfiguration for LLM provider settings (OpenAI/Anthropic)
- LLM client abstraction with OpenAI and Anthropic implementations
- McpConversationRepository and McpMessageRepository for CRUD
- McpClientService orchestrator with tool call loop
- McpClientResource REST API at /v1/mcp-client
- MCP tool executor registration from openmetadata-mcp module

Frontend:
- mcpClientAPI.ts REST client for chat, conversations, and messages
- McpChatPage with conversation sidebar, message list, and chat input
- MUI-based components with markdown rendering and tool call display
- Route, sidebar navigation, and i18n labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-09T12:38:43Z

Jest test Coverage

UI tests summary

Lines	Statements	Branches	Functions
	65.74% (57389/87291)	45.25% (30202/66735)	48.24% (9080/18819)

…eaming

Introduce the foundational types for server-sent event streaming in the MCP chat feature. ChatEvent is a Java record with static factory methods for each event type (conversation_created, text, tool_call_start, tool_call_end, message_complete, error, done). ChatEventEmitter is a functional interface consumed by the streaming chat method. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add a new chatStream() method that mirrors the existing chat() logic but emits ChatEvent events via a ChatEventEmitter callback instead of accumulating results and returning a ChatResponse. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add POST /v1/mcp-client/chat/stream endpoint that returns text/event-stream using Jersey StreamingOutput. Each ChatEvent is written as an SSE frame with event name and JSON data. Includes writeSseEvent helper method and proper error handling with Cache-Control and X-Accel-Buffering headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
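
For reference, an SSE frame with an event name and JSON data has the shape sketched below. This is a TypeScript illustration of the wire format only, not the Java writeSseEvent helper from the commit; `formatSseEvent` is a hypothetical name, and the header values are conventional choices assumed here because the commit message names the headers but not their values.

```typescript
// Serialize one event as a server-sent-events frame: an `event:` line,
// a `data:` line, then a blank line that terminates the frame.
function formatSseEvent(eventName: string, payload: unknown): string {
  // JSON.stringify escapes newlines inside strings, so the payload
  // always fits on a single data line.
  const json = JSON.stringify(payload);
  return `event: ${eventName}\ndata: ${json}\n\n`;
}

// Assumed values: `no-cache` stops caches from storing the stream, and
// `X-Accel-Buffering: no` asks an nginx proxy not to buffer the response.
const sseHeaders: Record<string, string> = {
  'Content-Type': 'text/event-stream',
  'Cache-Control': 'no-cache',
  'X-Accel-Buffering': 'no',
};
```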

Add ChatStreamEvent discriminated union type, ChatStreamCallbacks interface, and streamChatMessage function that uses native fetch to consume the /chat/stream SSE endpoint with proper auth and SSE parsing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
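
A minimal sketch of the client-side parsing this commit describes, under stated assumptions: `parseSseBuffer` and `toChatEvent` are hypothetical names, the payload fields (`content`, `name`) are guesses, only three of the event types are modelled, and frames are assumed to use LF line endings. The actual streamChatMessage implementation may be structured differently.

```typescript
type RawSseEvent = { event: string; data: string };

// Splits buffered SSE text into complete frames and returns the unparsed
// tail, so the caller can prepend it to the next network chunk.
function parseSseBuffer(buffer: string): { events: RawSseEvent[]; rest: string } {
  const frames = buffer.split('\n\n');
  const rest = frames.pop() ?? ''; // last piece is incomplete (or empty)
  const events: RawSseEvent[] = [];
  for (const frame of frames) {
    let event = 'message'; // SSE default when no event line is present
    const dataLines: string[] = [];
    for (const line of frame.split('\n')) {
      if (line.startsWith('event:')) {
        event = line.slice(6).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join('\n') });
    }
  }
  return { events, rest };
}

// Discriminated union keyed on `type`; payload fields are assumptions.
type ChatStreamEvent =
  | { type: 'text'; content: string }
  | { type: 'tool_call_start'; name: string }
  | { type: 'done' };

function toChatEvent(raw: RawSseEvent): ChatStreamEvent | undefined {
  const payload = JSON.parse(raw.data) as Record<string, unknown>;
  switch (raw.event) {
    case 'text':
      return { type: 'text', content: String(payload.content ?? '') };
    case 'tool_call_start':
      return { type: 'tool_call_start', name: String(payload.name ?? '') };
    case 'done':
      return { type: 'done' };
    default:
      return undefined; // other event types are omitted from this sketch
  }
}
```

Keeping `rest` between reads matters because a network chunk can end in the middle of a frame; parsing each chunk in isolation would drop or corrupt events.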

Replace sendChatMessage with streamChatMessage for progressive UI updates. Shows assistant text and tool calls incrementally as SSE events arrive. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The method signature was missing the content parameter, causing compilation errors in McpClientService. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Sidebar title: "MCP Chat" → "Chat"
- Empty state: clean centered heading + pill-shaped input with send icon
- ChatInput: replace separate Button with Send01 icon inside TextField
- Update i18n: mcp-chat-empty, mcp-chat-placeholder

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Human message bubbles: grey[200] bg with dark text (was unreadable white-on-primary-blue)
- Assistant bubbles: white bg with border (was grey[100])
- Trash icon: only visible on conversation hover
- Remove message count subtitle from conversation list
- Simplify token display color

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Show a CircularProgress spinner with "Thinking..." text when the assistant message has no content yet (during streaming before the first text or tool_call event arrives). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Create McpChatApplication as a separate installable app (marketplace only, not auto-installed)
- Move LLM config from openmetadata.yaml to app configuration (provided at install time)
- McpClientResource dynamically looks up McpChatApplication from ApplicationContext
- Simplify ToolExecutor interface (authorizer/limits captured in McpServer closure)
- Add McpChatPlugin for conditional Chat sidebar item (only shown when app is installed)
- Add UI application schema for install form (LLM provider, API key, model, endpoint, system prompt)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Verify MCP Chat nav item appears in sidebar when app is installed
- Test sidebar click navigates to /mcp-chat and renders chat UI
- Test send button enables with text input
- Uses proper fixtures, API-based app install/teardown, test.step()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CompletableFuture.runAsync calls had no error handling, silently swallowing failures from repository calls after title generation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…i-9d9ca8a4

github-actions · 2026-03-12T05:59:19Z

TypeScript types have been updated based on the JSON schema changes in the PR

sonarqubecloud · 2026-03-12T06:24:32Z

Quality Gate passed for 'open-metadata-ui'

Issues
9 New issues
0 Accepted issues

Measures
2 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2026-03-12T08:13:47Z

🟡 Playwright Results — all passed (2 flaky)

✅ 453 passed · ❌ 0 failed · 🟡 2 flaky · ⏭️ 2 skipped

Shard	Passed	Failed	Flaky	Skipped
🟡 Shard 1	453	0	2	2

🟡 2 flaky test(s) (passed on retry)

Pages/Customproperties-part1.spec.ts › sqlQuery shows scrollable CodeMirror container and no expand toggle (shard 1, 1 retry)
Pages/CustomThemeConfig.spec.ts › Update Hover and selected Color (shard 1, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

…i-9d9ca8a4

gitar-bot · 2026-03-12T10:35:18Z

🔍 CI failure analysis for a73638e: 1 test failure in integration tests caused by premature NDManager lifecycle closure in DjlEmbeddingClient, cascading to OpenSearch connection pool exhaustion and search index timeout. PR's new MCP/LLM client infrastructure increases concurrent resource demands, exposing improper resource cleanup timing.

Overview

Analyzed 53 logs across 43 error templates. CI pipeline shows a single critical test failure in integration tests (PostgreSQL + OpenSearch) with high confidence attribution to infrastructure resource management. The failure is a cascading consequence of DJL NDManager lifecycle mismanagement that degrades search indexing capabilities.

Failures

Search Index Entity Indexing Timeout (confidence: high)

Type: infrastructure
Affected jobs: integration-tests-postgres-opensearch
Related to PR: yes
Root cause: DjlEmbeddingClient prematurely closes NDManager resources before embedding operations complete (line 116). This triggers cascade failures: OpenSearch connection pool exhaustion ("Connection pool shut down"), vector service failures, and ultimately prevents search index updates within the 90-second timeout window.
Suggested fix:
1. Review DjlEmbeddingClient.embedBatch() resource lifecycle—ensure NDManager/Predictor resources are scoped to live for the full duration of async embedding operations, not closed immediately after launch
2. Implement proper resource pooling/caching for DJL models to eliminate repeated creation/destruction cycles
3. Add synchronization around OpenSearch vector index updates to prevent connection pool saturation during embedding failures
4. Consider adding backpressure/throttling to async vector embedding operations launched during entity creation, especially with new concurrent MCP/LLM client demands

Test Result

Test: DatabaseSchemaResourceIT.checkCreatedEntity
Error: "Condition with alias 'Wait for entity to appear in search index' didn't complete within 1 minutes 30 seconds... expected: but was: "
Test run summary: 10,857 tests run, 946 skipped, 1 failed

Summary

PR-related failures: 1 (indirect). The PR's new MCP client and LLM infrastructure add concurrent resource demands that expose an existing NDManager lifecycle bug in the vector embedding service
Infrastructure/resource failures: 1 — DJL NDManager lifecycle mismanagement causing cascading OpenSearch connection pool and vector indexing failures
Recommended action: Prioritize fixing NDManager resource lifecycle in DjlEmbeddingClient. This is not a new bug introduced by the PR itself, but the PR's additional concurrency has surfaced it. Once fixed, integration tests should pass reliably.

Code Review ⚠️ Changes requested 15 resolved / 17 findings

Adds MCP Client with conversation tracking and chat UI, resolving 15 prior issues including null pointer exceptions, connection leaks, and unbounded message loading. However, handleSelectAll incorrectly replaces selected items instead of merging, and prefixCls="w-full" on antd Space breaks component styling—both must be fixed before merge.

⚠️

Bug: handleSelectAll replaces all selections instead of merging

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:1

The new handleSelectAll function in AddTestCaseList replaces the entire selectedItems Map with only the currently-loaded items, discarding previously selected items. This is inconsistent with handleCardClick which correctly merges with existing selections using setSelectedItems((prevItems) => ...).

When selecting all: new Map(items.map(...)) only includes current items, losing any previously individually-selected items.
When deselecting: setSelectedItems(new Map()) wipes ALL selections, not just visible ones.

Suggested fix

In AddTestCaseList handleSelectAll, use functional state update:
setSelectedItems((prev) => {
  const next = new Map(prev);
  if (allCurrentlySelected) {
    items.forEach((item) => next.delete(item.id ?? ''));
  } else {
    items.forEach((t) => next.set(t.id ?? '', t));
  }
  onChange?.([...next.values()]);
  return next;
});
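
To make the difference between merging and replacing concrete, the same logic can be written as a pure function, independent of React state. `Item` and `toggleSelectAll` are hypothetical names for this sketch, and "visible items" stands for whatever page of results is currently loaded.

```typescript
type Item = { id?: string; name: string };

// Returns the next selection without mutating `prev`. Only the visible
// items are added or removed; selections from other pages are preserved.
function toggleSelectAll(
  prev: Map<string, Item>,
  visibleItems: Item[],
  allVisibleSelected: boolean
): Map<string, Item> {
  const next = new Map(prev);
  for (const item of visibleItems) {
    const key = item.id ?? '';
    if (allVisibleSelected) {
      next.delete(key);
    } else {
      next.set(key, item);
    }
  }
  return next;
}

// Usage: 'a' was selected on an earlier page; 'b' and 'c' are visible now.
const earlier = new Map<string, Item>([['a', { id: 'a', name: 'first' }]]);
const visible: Item[] = [
  { id: 'b', name: 'second' },
  { id: 'c', name: 'third' },
];
const afterSelectAll = toggleSelectAll(earlier, visible, false); // keys: a, b, c
const afterDeselect = toggleSelectAll(afterSelectAll, visible, true); // keys: a
```

With the replacing behaviour described in the finding, the first call would yield only b and c, and the second would yield an empty map.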

⚠️

Bug: prefixCls="w-full" on antd Space breaks component styling

📄 openmetadata-ui/src/main/resources/ui/src/rest/mcpClientAPI.ts:1

In AddTestCaseList.component.tsx (around line 201 in the diff), prefixCls="w-full" is added to an antd <Space> component. The prefixCls prop changes the CSS class prefix for antd's internal generated class names (default "ant"), changing classes like ant-space-item to w-full-item and completely breaking the component's layout and spacing. The desired full-width styling is already applied via className="w-full" on the same element. Remove the prefixCls prop.

✅ 15 resolved

✅ Bug: MCP Client enabled flag is never checked; chat works with null API key

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/mcpclient/McpClientResource.java:92 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:68 📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/LlmClientFactory.java:21
The McpClientConfiguration.enabled flag (default false) is never consulted. McpClientResource.initialize() always creates McpClientService, which always creates a real LLM client via LlmClientFactory.create(). When enabled=false and no API key is configured (the default deployment), any user hitting /v1/mcp-client/chat will trigger an HTTP request to OpenAI with Authorization: Bearer null, resulting in a confusing 401 error from the external API rather than a clear 'feature not enabled' response.

The initialize() method should skip service creation when mcpConfig.isEnabled() is false, and the endpoints should return HTTP 404 or 501 when the feature is disabled.

✅ Bug: NullPointerException when toolExecutor is null and LLM returns tool calls

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:137 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:61
In McpClientService.chat() (line 137), toolExecutor.executeTool(...) is called without any null check. toolExecutor is a volatile field defaulting to null, only set asynchronously via setToolExecutor() after the MCP server registers. If the LLM returns tool calls before the tool executor is registered (or if registration fails silently, as warned on line 117 of McpServer), this will throw a NullPointerException in a request handler.

Currently this is partially mitigated by the fact that an empty toolDefinitions list means the LLM won't generate tool calls, but this coupling is fragile and not documented.
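
The finding is marked resolved, but the thread does not show how. One possible guard, sketched in TypeScript rather than the Java of McpClientService, is to turn a missing executor into an explicit tool error instead of dereferencing it. `executeToolSafely` and `ToolResult` are hypothetical names used only for illustration.

```typescript
type ToolExecutor = (name: string, args: Record<string, unknown>) => string;
type ToolResult = { ok: true; output: string } | { ok: false; error: string };

// Wraps tool execution so that an unregistered executor, or a tool that
// throws, yields an error result the caller can report back to the LLM.
function executeToolSafely(
  executor: ToolExecutor | null,
  name: string,
  args: Record<string, unknown>
): ToolResult {
  if (executor === null) {
    return { ok: false, error: `Tool executor not registered; cannot run '${name}'` };
  }
  try {
    return { ok: true, output: executor(name, args) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

This removes the reliance on an empty tool definition list to keep the LLM from requesting tools, which the finding calls fragile.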

✅ Security: LLM API key lacks serialization protection on config class

📄 openmetadata-service/src/main/java/org/openmetadata/service/config/McpClientConfiguration.java:28
The apiKey field in McpClientConfiguration has @JsonProperty("apiKey") but no @JsonIgnore for serialization. While no current endpoint directly returns this config, the field is fully accessible via OpenMetadataApplicationConfig.getMcpClientConfiguration().getApiKey(). If any config-dump, diagnostics endpoint, or error handler serializes the application config, the API key will be leaked in plaintext. This is a defense-in-depth concern.

✅ Bug: Thread interrupt flag set incorrectly on IOException

📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/AnthropicLlmClient.java:77 📄 openmetadata-service/src/main/java/org/openmetadata/service/clients/llm/OpenAiLlmClient.java:74
Both AnthropicLlmClient and OpenAiLlmClient catch IOException | InterruptedException in a single handler and unconditionally call Thread.currentThread().interrupt(). When the caught exception is an IOException (not an interrupt), this incorrectly sets the thread's interrupt flag, which can cause downstream blocking operations (e.g., JDBC connections returning to pool) to throw unexpected InterruptedExceptions.

✅ Edge Case: getConversationWithMessages truncates to 20 messages silently

📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:224 📄 openmetadata-service/src/main/java/org/openmetadata/service/mcpclient/McpClientService.java:261
In McpClientService.getConversationWithMessages() (line 224), messages are fetched with CONVERSATION_HISTORY_LIMIT = 20. This means the GET /conversations/{id} endpoint silently truncates conversations longer than 20 messages without informing the client. The UI fetchMessages uses the separate /messages endpoint with limit=50, so this mainly affects the direct conversation GET endpoint, but the same limit is used in buildLlmMessages() meaning the LLM loses context of conversations with more than 20 messages.

...and 10 more resolved from earlier reviews



sonarqubecloud · 2026-03-12T11:32:07Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

pmbrull and others added 2 commits March 9, 2026 12:31

docs: add implementation plan for MCP Client with message tracking

ba6d136

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pmbrull requested a review from a team as a code owner March 9, 2026 12:04


github-actions bot added the Ingestion and safe to test labels Mar 9, 2026