Skip to content

feat(appkit): scan upward from preferred port in development#349

Merged
MarioCadenas merged 2 commits intomainfrom
mario/dev-dynamic-listen-port
May 6, 2026
Merged

feat(appkit): scan upward from preferred port in development#349
MarioCadenas merged 2 commits intomainfrom
mario/dev-dynamic-listen-port

Conversation

@MarioCadenas
Copy link
Copy Markdown
Collaborator

Summary

  • In NODE_ENV=development, the server plugin now picks the first free TCP port by scanning upward from the preferred port (server({ port }), DATABRICKS_APP_PORT, or default 8000) using get-port's portNumbers(from, to) helper, bounded to 100 attempts.
  • Avoids EADDRINUSE failures when running multiple AppKit dev servers locally (e.g. two playgrounds at once).
  • Production / non-development behavior is unchanged: the plugin binds the configured / env / default port and lets listen errors surface as before.
  • Startup log already prints the bound URL via the existing Server running on http://… line; no extra fallback log is emitted.

Implementation

  • New private ServerPlugin.resolveListenPort() returns:
    • In development: await getPort({ host, port: portNumbers(requested, requested + 99) }).
    • Otherwise: the requested port (no scan).
  • start() awaits the resolved port before app.listen(...).
  • logStartupInfo() reads the resolved port so logs match the actual socket.
  • Adds get-port@7.2.0 to @databricks/appkit dependencies.

Test plan

  • pnpm --filter @databricks/appkit typecheck
  • pnpm exec vitest run packages/appkit/src/plugins/server/tests/server.test.ts packages/appkit/src/plugins/server/tests/server.integration.test.ts (37 tests pass; new dev/prod cases cover the resolution branches via a partial get-port mock)
  • pnpm exec biome check --write on touched files
  • Manual: start two dev playgrounds back-to-back; confirm second one binds 8001 (or next free port) and the URL line reflects it

When NODE_ENV is development, the server plugin now picks the first
available TCP port starting from the preferred port (server({ port }),
DATABRICKS_APP_PORT, or default 8000) and scanning upward using
get-port's portNumbers() helper (bounded to 100 ports). This avoids
EADDRINUSE failures when running multiple AppKit dev servers locally.

Production / non-development behavior is unchanged: the plugin binds
the configured/env/default port and surfaces listen errors as before.

Logging now reflects the actually bound port on the existing
"Server running on http://..." line; no extra fallback line is emitted.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
@MarioCadenas MarioCadenas requested a review from a team as a code owner May 6, 2026 10:13
@MarioCadenas MarioCadenas requested a review from atilafassina May 6, 2026 10:13
Comment thread packages/appkit/src/plugins/server/index.ts
Address review feedback on #349: when the preferred dev listen port is
busy and the server binds a different port from the scan range, emit a
single info log "Port {requested} was busy, picking {bound}". Silent
when the preferred port was free.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
@MarioCadenas MarioCadenas merged commit 12598dc into main May 6, 2026
8 checks passed
@MarioCadenas MarioCadenas deleted the mario/dev-dynamic-listen-port branch May 6, 2026 11:01
MarioCadenas added a commit that referenced this pull request May 7, 2026
Biome import collapsing on agent loader, run-agent, and tests after
rebasing onto main. Lockfile and synced plugin manifest reflect the
current main state (including get-port from #349 already on main).

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
MarioCadenas added a commit that referenced this pull request May 7, 2026
Biome import collapsing on agent loader, run-agent, and tests after
rebasing onto main. Lockfile and synced plugin manifest reflect the
current main state (including get-port from #349 already on main).

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
MarioCadenas added a commit that referenced this pull request May 8, 2026
…agents (#304)

* feat(appkit): agents() plugin, createAgent(def), and markdown-driven agents

The main product layer. Turns an AppKit app into an AI-agent host with
markdown-driven agent discovery, code-defined agents, sub-agents, and
a standalone run-without-HTTP executor.

Agent runtime files land in core/agent/ from day one:

  core/agent/create-agent.ts    — createAgent() definition factory
  core/agent/run-agent.ts       — standalone adapter loop (no HTTP)
  core/agent/load-agents.ts     — markdown agent discovery
  core/agent/system-prompt.ts   — base system prompt + composition
  core/agent/types.ts           — updated with AgentDefinition,
                                   AgentsPluginConfig, RegisteredAgent, etc.

HTTP-facing concerns stay in plugins/agents/:
  agents.ts, thread-store.ts, tool-approval-gate.ts,
  event-channel.ts, event-translator.ts, schemas.ts,
  defaults.ts, manifest.json

* refactor(appkit): generalize default base system prompt

Tool-agnostic guidelines instead of SQL/files-specific defaults; accept full
PromptContext in buildBaseSystemPrompt for parity with custom callbacks.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* feat(appkit): optional serving_endpoint on agents plugin manifest

Register DATABRICKS_SERVING_ENDPOINT_NAME as optional CAN_QUERY so apps using
Databricks-hosted agent models get resource wiring; optional when agents use
only external adapters. Sync template/appkit.plugins.json.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* fix(appkit): agents manifest uses DATABRICKS_AGENT_ENDPOINT

Align optional serving resource with `DatabricksAdapter.fromModelServing()`, which
reads `DATABRICKS_AGENT_ENDPOINT` — not `DATABRICKS_SERVING_ENDPOINT_NAME`
(serving plugin). Sync template.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* feat(agents): folder-based markdown discovery (<id>/agent.md)

Top-level config/agents/*.md is no longer loaded. Use
<agentId>/agent.md. The skills directory name is reserved and skipped.
Orphan top-level .md files error at load; subdirs without agent.md
error.

Export agentIdFromMarkdownPath for path-based id resolution.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* refactor(appkit): promote MCP client + host policy to connectors/mcp

The MCP transport client and host policy aren't agents-specific; they are
HTTP + JSON-RPC transport with URL/DNS allowlisting. Move them under
packages/appkit/src/connectors/mcp/ so they sit alongside the other
transport-layer modules (serving, genie, sql-warehouse, lakebase, …) and
stop being reachable only through the agents plugin.

- Move mcp-client.ts          -> connectors/mcp/client.ts
- Move mcp-host-policy.ts     -> connectors/mcp/host-policy.ts
- Move McpEndpointConfig type -> connectors/mcp/types.ts
- Add connectors/mcp/index.ts barrel; re-export from connectors/index.ts
- Move mcp-client / mcp-host-policy tests to connectors/mcp/tests/
- Agents plugin keeps hosted-tools.ts (HostedTool sugar + resolve) and
  imports connector types from ../../connectors/mcp.
- tools/ barrel no longer re-exports AppKitMcpClient (never was public).

No behaviour change. All existing tests pass against the new paths.

* refactor(appkit): extract normalizeToolResult, consumeAdapterStream, dispatchToolCall

Three small helpers pulled out of the AgentsPlugin streaming path to cut
duplication and shrink the two large methods.

- normalize-result.ts: void->"", JSON-stringify, 50K truncation with a
  human-readable marker. Unit-testable (previously covered only via the
  HTTP path).
- consume-adapter-stream.ts: the 'message_delta' + 'message' accumulation
  loop shared between _streamAgent and runSubAgent. Accepts an optional
  signal and per-event side-effect callback (for SSE translation).
- tool-dispatch.ts: one place that fans out toolkit/function/mcp/subagent
  entries. 'never'-typed default forces exhaustiveness: adding a fifth
  source is now a compile error at every call site.

_streamAgent: executeTool closure shrinks from ~60 lines of dispatch +
normalize to a single dispatchToolCall + normalizeToolResult call.
Stream consumption collapses to consumeAdapterStream.

runSubAgent: childExecute shrinks from ~30 lines of if/else dispatch to
one dispatchToolCall call. Adapter loop collapses to consumeAdapterStream.

Behaviour change (minor): childExecute previously silently fell through to
'Unsupported sub-agent tool source' when mcpClient or PluginContext was
missing; now it throws the same specific error as the main stream. Matches
the main-path behaviour.

Tests: 15 new unit tests for normalizeToolResult + consumeAdapterStream.
dispatchToolCall is exercised transitively through the full agent suite
(288 existing tests still pass, 303 total on this branch).

* fix(agents): propagate tool annotations through tool() → FunctionTool → def

The `annotations` field (notably `destructive: true`) was silently dropped
as tools flowed from `tool({...})` into the resolved `AgentToolDefinition`,
so user-defined destructive tools never triggered the approval gate.

- `ToolConfig` now accepts `annotations?: ToolAnnotations`.
- `tool()` forwards it to the returned `FunctionTool`.
- `FunctionTool` exposes `annotations` and `functionToolToDefinition`
  preserves it on the definition it builds.
- `AgentsPlugin` reads the flag via `isDestructiveToolEntry()` (falls back
  to `functionTool.annotations` so a future divergence between def and
  function cannot re-introduce the bug) and emits the merged annotations
  via `combinedToolAnnotations()` on the `approval_pending` SSE payload.

Covered by `tests/tool-approval-gate.test.ts` and
`tests/function-tool.test.ts`.

* feat(agents): semantic ToolEffect — write/update/destructive tiers

ToolAnnotations.destructive is binary and has started to mislead:
"save_view" captures a screenshot and creates a new file, which is
nothing like deleting a dashboard, yet both trip the same red
"destructive" approval card. This adds a semantic `effect` enum with
four tiers — `read`, `write`, `update`, `destructive` — so tool
authors can tell the UI what blast radius they actually have. The
approval gate fires for any mutating effect (`write`/`update`/
`destructive`) and continues to honour the legacy `destructive: true`
flag so existing tools keep their current red treatment without
migration. Callers consuming `annotations` over the wire (MCP clients,
approval UIs) can now differentiate; the playground will ship a
tiered approval card as a follow-up.

* chore(appkit): post-rebase formatting and lockfile sync

Biome import collapsing on agent loader, run-agent, and tests after
rebasing onto main. Lockfile and synced plugin manifest reflect the
current main state (including get-port from #349 already on main).

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* fix(appkit): apply review feedback to agents plugin

PR #304 agentic review applied (P1 + cheap P2):

- Approval gate now honours the modern `effect` field (write/update/
  destructive), matching the documented contract on ToolAnnotations.
  Previously a tool authored with `effect: "destructive"` and no legacy
  `destructive: true` boolean bypassed the gate.
- Sub-agent tool calls share the parent's RunState, so the per-run
  tool-call budget and the destructive-tool approval gate apply to
  nested sub-agent calls — not only the top-level adapter.
- /invocations enforces `maxConcurrentStreamsPerUser`. Without this a
  client could bypass the cap by switching from /chat to /invocations.
- /cancel uses a Zod schema instead of a raw `as` cast, matching the
  validation pattern of the sibling routes.
- agents.reload() builds the registry into a fresh Map and only swaps
  on success. A malformed markdown file no longer wipes the live
  registry and breaks in-flight requests.
- consumeAdapterStream and normalizeToolResult are wired into the four
  call sites that previously inlined the same accumulation /
  serialization logic (top-level _streamAgent, runSubAgent,
  dispatchToolCall, runAgent).
- load-agents.ts uses fs.promises so reload() does not block the event
  loop.
- Pin js-yaml to 4.1.1 (drop the caret), matching the rest of appkit's
  pinned deps.

Tests cover the approval-gate `effect` matrix (destructive/write/update
trigger; read/undefined skip), the deny path, the shared budget across
top-level + sub-agent dispatches, and the /invocations rate-limit gate.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* fix(appkit): forward sub-agent events into the parent SSE stream

After dispatchToolCall consolidated the sub-agent path, runSubAgent
called consumeAdapterStream without forwarding the child's adapter
events. The parent stream therefore only emitted the outer agent-<name>
function call; nested tool_call / tool_result events from the sub-agent
never reached the client.

The smart-dashboard query agent delegates to dashboard_pilot whose UI-
action tools (apply_filter, highlight_period, focus_chart, etc.) rely on
the SSE stream to apply React state mutations. Without the forwarding,
the user asks for a highlight and nothing visible happens.

Forward every sub-agent event except metadata. Sub-agents have their
own threadId; emitting it would overwrite the parent's thread state on
the client and break multi-turn continuity.

Test exercises the metadata-skipping rule and asserts tool_call,
tool_result, and message_delta all reach outboundEvents.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

* fix(appkit): apply remaining review feedback (timeout config + result string)

Address the two PR-304 P2 findings deferred from the first round:

#8 — Hardcoded 30s timeout in PluginContext.executeTool truncated
legitimate cold SQL Warehouse and long Genie tool calls. Add
agents({ limits: { toolCallTimeoutMs } }) (default 5 minutes), plumb
through RunState into PluginContext.executeTool which now accepts an
optional timeoutMs (also defaulting to 300_000 for non-agents callers).

#9 — normalizeToolResult returned the raw object for short non-string
results and a string for truncated results. Every downstream consumer
stringified at the wire boundary anyway, so the asymmetry just complicated
the type signature. Always return string and JSON-stringify null/objects;
the existing defensive typeof === "string" ? : JSON.stringify(...) at
adapter and translator sites still handles non-AppKit-mediated results.

#11 (printRegistry uses console.log) is left as-is — intentional
picocolors-styled startup banner; logger prefix would break the column
alignment and bypass log-level filtering.

Tests cover the new toolCallTimeoutMs default (300_000), the override
path, the runState → context plumbing, and the always-string contract on
normalizeToolResult including the null → "null" change.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

---------

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Co-authored-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants