[KYUUBI #7379][2b/4] Data Agent Engine: agent runtime, middleware stack, and OpenAI provider #7417
wangzhigang1999 wants to merge 2 commits into apache:master
…re stack, OpenAI provider, and live E2E tests
This PR delivers the runtime layer of the Data Agent Engine on top of the tool
system and data source plumbing from 2a/4:
- ReactAgent: ReAct-style loop with streaming LLM responses, per-step tool
dispatch, and AgentRunContext tracking token usage, iterations, and session.
- Middleware stack (AgentMiddleware + ReactAgent.Builder):
* LoggingMiddleware -- structured per-step/LLM/tool/finish logs with MDC.
* ApprovalMiddleware -- CompletableFuture-based resolve for DESTRUCTIVE
tools; modes NORMAL / STRICT / AUTO_APPROVE.
* CompactionMiddleware -- token-threshold-triggered history summarization
with KEEP_RECENT_TURNS=4, emits a Compaction AgentEvent so clients can
observe the mechanism firing.
* ToolResultOffloadMiddleware -- spills large tool outputs to disk and
surfaces `read_tool_output` / `grep_tool_output` companion tools for the
LLM to re-query truncated previews.
- OpenAiProvider: single shared ReactAgent, per-session ConversationMemory,
streaming chat completions, Hikari-pooled JDBC data source; reads model and
thresholds from KyuubiConf.
- ExecuteStatement (Scala): encodes all AgentEvents (including compaction and
approval_request) as SSE JSON rows streamed through the JDBC reply column.
- KyuubiConf: new keys for LLM provider/api-url/model/api-key, approval mode,
compaction trigger tokens, offload root/thresholds, max iterations, etc.
- Tests:
* Unit tests for runtime, middlewares, offload store, and event shapes.
* Live tests gated on DATA_AGENT_LLM_API_KEY covering full LLM round-trips:
ReactAgentLiveTest (offload+grep, approval approve/deny), DataAgentE2ESuite
and DataAgentApprovalE2ESuite (JDBC layer), DataAgentCompactionE2ESuite
(JDBC-observable compaction event + post-compaction recovery),
CompactionMiddlewareLiveTest.
* Compatibility verified against qwen3.6-plus, glm-5, and kimi-k2.5 via
per-call `model=` logging in ReactAgent.
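
For reviewers who want a feel for the compaction rule described above, here is a minimal sketch of threshold-triggered summarization that keeps the most recent turns. It is an illustration only: the class name, the use of plain strings as turns, and the `summarizer` function are assumptions and do not mirror the actual `CompactionMiddleware` code. Only the `KEEP_RECENT_TURNS=4` value and the token-threshold trigger come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of threshold-triggered history compaction; not the PR's classes. */
final class CompactionSketch {
    // Mirrors the KEEP_RECENT_TURNS=4 constant named in the commit message.
    static final int KEEP_RECENT_TURNS = 4;

    /**
     * Returns the history unchanged while promptTokens is below triggerTokens
     * (or when there is nothing older than the recent window). Otherwise
     * replaces everything except the last KEEP_RECENT_TURNS entries with a
     * single summary entry produced by the summarizer.
     * Assumption: each list entry stands for one conversation turn.
     */
    static List<String> compact(List<String> history, long promptTokens, long triggerTokens,
                                Function<List<String>, String> summarizer) {
        if (promptTokens < triggerTokens || history.size() <= KEEP_RECENT_TURNS) {
            return history;
        }
        int split = history.size() - KEEP_RECENT_TURNS;
        List<String> compacted = new ArrayList<>();
        // Older turns collapse into one summary message.
        compacted.add(summarizer.apply(history.subList(0, split)));
        // Recent turns are carried over verbatim.
        compacted.addAll(history.subList(split, history.size()));
        return compacted;
    }
}
```

In the real middleware the summarizer would be an LLM call and a Compaction event would be emitted when the branch fires; both are left out here to keep the sketch self-contained.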
MySQL Connector/J is GPL-licensed and cannot be bundled in an Apache binary release. Users who need the MySQL/StarRocks datasource at runtime should provide the driver jar themselves on the engine classpath. Addresses review feedback on apache#7417.
Evidence: runtime under real workload
TL;DR
Setup
Results
Overall: 500 questions,
By difficulty:
By database (sorted by EX):
Cost: ~45 min wall time at concurrency=8, ~21M tokens total.
What this evidence supports
Scope disclaimers
Follow-up: Spark backend run
Same harness, same 500 BIRD questions, but the agent targets a real EMR Kyuubi + Spark 3.5.3 cluster via
This setup is materially stricter than the official BIRD evaluation. BIRD pins the target
What this run confirms for the PR:
[Figure: ReactAgent Execution Flow]
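
As a rough illustration of the execution flow, a ReAct-style loop can be sketched in Java as follows. This is a simplified sketch under stated assumptions, not the PR's `ReactAgent`: the `Step` type, the function-based "model", the string history, and the `lookup` tool name in the usage note are all hypothetical, and streaming, middleware hooks, and approval are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a ReAct-style loop; names do not mirror the PR's classes. */
final class ReactLoopSketch {
    /** One model step: either a final answer, or a request to call a tool. */
    static final class Step {
        final String toolName; // null when the model returns a final answer
        final String payload;  // tool argument, or the final answer text
        Step(String toolName, String payload) {
            this.toolName = toolName;
            this.payload = payload;
        }
    }

    /**
     * Alternates model steps and tool calls until the model answers or the
     * iteration budget runs out. The "model" is any function of the history.
     */
    static String run(Function<List<String>, Step> model,
                      Map<String, Function<String, String>> tools,
                      String question, int maxIterations) {
        List<String> history = new ArrayList<>();
        history.add("user: " + question);
        for (int i = 0; i < maxIterations; i++) {
            Step step = model.apply(history);
            if (step.toolName == null) {
                return step.payload; // a final answer ends the loop
            }
            Function<String, String> tool = tools.get(step.toolName);
            // An unknown tool is reported back as an observation so the model can recover.
            String observation = (tool == null)
                ? "error: unknown tool " + step.toolName
                : tool.apply(step.payload);
            history.add("tool[" + step.toolName + "]: " + observation);
        }
        return "stopped: iteration budget exhausted";
    }
}
```

A caller would register tools in a map (for example a hypothetical `lookup` tool) and pass a model function that inspects the history; the budget check corresponds to the max-iterations setting mentioned in the commit message.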
Hi @pan3793, when you have time, could I ask for a review on this one? 🙏

Third PR of the Data Agent Engine series (umbrella #7379, labeled 2b/4), adds the
It's on the larger side (~5.3k lines, almost all under
No rush, thanks!
Why are the changes needed?
Part 2b of 4 for the Data Agent Engine (umbrella, KPIP-7373).
This PR adds the ReAct agent runtime that drives the LLM <-> tool loop, a composable middleware stack around it, and a production `OpenAiProvider`. It sits on top of the tool system and data source abstraction introduced in PR 2a, and is consumed by the REST layer in PR 3.

Changes include:
- `ReactAgent`: ReAct loop with streaming, tool-call dispatch, turn budget, malformed-tool-call recovery
- `ConversationMemory`: message history with cumulative prompt-token tracking
- `AgentRunContext` / `AgentInvocation` / `ApprovalMode`: per-run state plumbing
- `ToolOutputStore`: size-gated tool-output offload, keyed by session+call-id, with `ReadToolOutputTool` / `GrepToolOutputTool` for LLM-driven retrieval
- `AgentMiddleware` interface with `onRegister` hook for tool wiring, plus four middlewares:
  - `LoggingMiddleware`: structured request/response logging
  - `ApprovalMiddleware`: risk-level-based approval gate
  - `CompactionMiddleware`: token-threshold-driven history summarization keyed by session
  - `ToolResultOffloadMiddleware`: transparently owns the `ToolOutputStore` and registers retrieval tools
- `OpenAiProvider`: OpenAI-compatible chat completions with streaming and tool calls
- `ExecuteStatement.scala`: SSE encoding extended to emit `Compaction` events
- `datasource.dialect` package for organization
- `kyuubi.engine.data.agent.compaction.trigger.tokens` configuration entry
- `MockLlmProvider`: deterministic mock for middleware and runtime tests
- `mysql-connector-j` moved to `test` scope (GPL-licensed; cannot be bundled in an Apache binary release; addresses review feedback on #7417)

How was this patch tested?
- Unit tests: `ConversationMemoryTest`, `ToolOutputStoreTest`, `ApprovalMiddlewareTest`, `CompactionMiddlewareTest`, `ToolResultOffloadMiddlewareTest`, `event/EventTest`, plus updates to `ToolRegistryThreadSafetyTest` / `ToolTest` / `RunSelectQueryToolTest` / `RunMutationQueryToolTest` / `JdbcDialectTest` / `MySQLDialectTest`
- Live tests (gated on `DATA_AGENT_LLM_API_KEY` / `DATA_AGENT_LLM_API_URL` / `DATA_AGENT_LLM_MODEL`): `ReactAgentLiveTest` and `CompactionMiddlewareLiveTest` exercise the full loop against a real OpenAI-compatible endpoint
- `DataAgentE2ESuite` extended with OpenAI-provider paths; new `DataAgentCompactionE2ESuite` observes compaction via JDBC

Was this patch authored or co-authored using generative AI tooling?
Partially assisted by Claude Code (Claude Opus 4.7) for test generation, code review, and PR formatting. Core design and implementation are human-authored.