Skip to content

doc: design spec for azd ai agent toolbox direct commands#8160

Open
hund030 wants to merge 5 commits into
mainfrom
zhihuan/toolbox-command-design-spec
Open

doc: design spec for azd ai agent toolbox direct commands#8160
hund030 wants to merge 5 commits into
mainfrom
zhihuan/toolbox-command-design-spec

Conversation

@hund030
Copy link
Copy Markdown
Collaborator

@hund030 hund030 commented May 13, 2026

Summary

This PR adds the design spec for the azd ai agent toolbox direct commands. This is for design review only and contains no implementation code.

What's in this spec

  • Five top-level verbs for managing versioned toolboxes against a Foundry project:

    • toolbox create <name>: register a toolbox (pending tools until first connection add).
    • toolbox update <name> --default-version <n>: retarget the active version.
    • toolbox delete <name>: delete the whole toolbox or a single version via --version.
    • toolbox show <name>: display metadata, version body, and the toolbox's MCP endpoint URL for runtime consumption.
    • toolbox list: list toolboxes plus any local pending records for the resolved endpoint.
  • Connection subgroup (connection add | remove | list) that wraps the toolbox versions API. Every tool mutation is a full-tools[] POST to /toolboxes/{name}/versions followed by a PATCH default_version; tool-entry shape is inferred from the project connection's ARM category (RemoteToolmcp, CognitiveSearchazure_ai_search).

  • Tag subgroup (tag set | remove | list) shipped as a Compatibility stub: toolboxes are not yet exposed as ARM resources, so the standard ARM resource-tag surface is unavailable. The CLI surface is final and activates without surface changes once the underlying API exists.

  • Pending-toolbox config store: create cannot POST because the service requires a non-empty tools[], so it records a local entry under extensions.ai-agents.pending-toolboxes.<endpointHash>.items.<name>. The first connection add promotes the record to a real v1.

  • Live-verified API surface (§ 4): every endpoint the spec relies on was probed against a live Foundry project, including the per-version delete guard, the MCP runtime endpoint, and the ARM connection lookup used by connection add.

Key design decisions for review

  1. Extension placement (§ 3): toolbox commands live inside the existing azure.ai.agents extension at azd ai agent toolbox …. Files are isolated under internal/cmd/toolbox*.go and internal/pkg/toolbox/ so a future move to a standalone azure.ai.toolbox extension is a registration-only change.
  2. create does not POST (§ 5.1, Open Question 1): the service requires a non-empty tools[]. Two options were considered — auto-seed with a sentinel built-in, or defer the first POST until the first connection add. Current proposal: defer (pending-record model). Keeps create non-network and matches the "one command, one action" principle.
  3. update is --default-version only (§ 5.2): the service only accepts default_version on PATCH. Description and tool edits must go through connection add / connection remove, which publish a new version.
  4. Per-version delete is exposed (§ 5.3): the service refuses to delete the default_version when sibling versions exist (400). When the default is the only remaining version the service deletes it and cascades to remove the parent toolbox; the CLI rejects this without --force to avoid surprise destruction.
  5. Tools support both built-ins and connection-backed entries (§ 4.2, § 2 Non-Goals): the service accepts code_interpreter, web_search, file_search, mcp, and azure_ai_search side-by-side. Per issue Feature: Add ai.toolbox commands for CRUD on Toolboxes in Foundry Project #8143, built-ins are wired on the agent rather than via toolboxes; the CLI does not author built-ins but carries them through unchanged during fetch-merge-POST.
  6. No -p short form (§ 5.1): -p is already taken on agent run / agent invoke. The canonical name is --project-endpoint, matching the project-context spec.

Related

  • Feature request: #8143 Feature: Add ai.toolbox commands for CRUD on Toolboxes in Foundry Project
  • Sibling design specs:
    • #8138 docs: design spec for azd ai connection direct commands
    • #8152 doc: design spec for azd ai project context commands and endpoint resolution — the 5-level endpoint resolution cascade referenced in § 6 is owned by this spec.
  • Feature spec: azd ai Direct Commands (CLI Surface for Foundry)

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Review may take a bit longer — reach out to @rajeshkamal5050 or @kristenwomack if you'd like to discuss prioritization.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new design specification document describing the proposed azd ai agent toolbox direct-command surface in the azure.ai.agents extension (design review only; no implementation).

Changes:

  • Documents the planned toolbox CRUD command surface (create|update|delete|show|list) plus connection and tag subgroups.
  • Specifies data-plane API endpoints, request/response shapes, and CLI behaviors (including a local pending-toolbox record model).
  • Outlines endpoint resolution, config storage shape, telemetry events, error codes, and a test plan.
Comments suppressed due to low confidence (1)

cli/azd/docs/design/azure-ai-toolbox-direct-commands.md:166

  • The spec’s computed MCP consumption URL omits the toolbox version ({projectEndpoint}/toolboxes/{name}/mcp?...). In the existing extension code, the MCP endpoint is version-scoped: /toolboxes/{name}/versions/{version}/mcp?api-version=v1 (see extensions/azure.ai.agents/internal/cmd/listen.go). If the service contract is versioned, this spec should reflect that so show returns a runnable URL.
1. `GET /toolboxes/{name}` for `default_version`.
2. `GET /toolboxes/{name}/versions/{version}` (or `default_version` when `--version` is absent) for the body.
3. Compute the toolbox's MCP consumption URL as `{projectEndpoint}/toolboxes/{name}/mcp?api-version=v1`.
4. Output:

Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md
Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md Outdated
Comment on lines +19 to +22
- The eleven verbs listed in § 1.
- Cross-cutting flags (`--output table|json`, `--no-prompt`, `--debug`, `--project-endpoint`) on every new command.
- A self-contained data-plane client at `internal/pkg/toolbox/`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hund030 Reuse the existing one if you can for now.

Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md Outdated
Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md
Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md Outdated
Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md
Comment thread cli/azd/docs/design/azure-ai-toolbox-direct-commands.md Outdated
Comment on lines +19 to +22
- The eleven verbs listed in § 1.
- Cross-cutting flags (`--output table|json`, `--no-prompt`, `--debug`, `--project-endpoint`) on every new command.
- A self-contained data-plane client at `internal/pkg/toolbox/`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hund030 Reuse the existing one if you can for now.


## 13. Open Questions

1. Should `create` also accept an optional `--connection <name>` flag to publish v1 immediately, skipping the pending-record step? Current proposal: no — the pending-record model keeps `create` free of toolbox-write calls (it only reads `GET /toolboxes/{name}` to decide between the registered / existing one-liner) and matches the "one command, one action" principle from the umbrella spec.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

## 13. Open Questions

1. Should `create` also accept an optional `--connection <name>` flag to publish v1 immediately, skipping the pending-record step? Current proposal: no — the pending-record model keeps `create` free of toolbox-write calls (it only reads `GET /toolboxes/{name}` to decide between the registered / existing one-liner) and matches the "one command, one action" principle from the umbrella spec.
2. Should `toolbox list` surface every version's tool count, or just the default-version's count? Current proposal: just the default — matches what `show` displays by default and keeps the list call to a single GET per toolbox.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default.


1. Should `create` also accept an optional `--connection <name>` flag to publish v1 immediately, skipping the pending-record step? Current proposal: no — the pending-record model keeps `create` free of toolbox-write calls (it only reads `GET /toolboxes/{name}` to decide between the registered / existing one-liner) and matches the "one command, one action" principle from the umbrella spec.
2. Should `toolbox list` surface every version's tool count, or just the default-version's count? Current proposal: just the default — matches what `show` displays by default and keeps the list call to a single GET per toolbox.
3. Should `connection add` accept `--as <alias>` to decouple the tool entry's `name` from the connection name? Current proposal: no — defer until users ask.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

Copy link
Copy Markdown
Member

@jongio jongio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid spec with clear API mapping, good test plan, and well-structured command surface. A few design gaps that'll bite during implementation:

Error code overloading: CodeInvalidToolbox is used for 6+ distinct failure modes (duplicate connection, unsupported category, empty tools, missing --index, missing connection, invalid version target). This makes telemetry useless for distinguishing root causes. Consider at least: CodeDuplicateConnection, CodeUnsupportedConnectionCategory, CodeEmptyToolsArray, CodeMissingIndex.

TOCTOU in shared write flow: Steps 2-5 (fetch default version, mutate, POST, PATCH) have a race window. Two concurrent connection add calls both fetch the same version, both POST, and the second creates a version missing the first's tool. Worth acknowledging as a known limitation or describing retry/conflict detection (e.g., conditional on the fetched version number).

Stale pending-record accumulation: Records are cleared by connection add or delete, but if a user runs create and never follows up (or the endpoint changes), pending records accumulate forever. Consider a TTL (e.g., 30 days) or a toolbox list --cleanup-stale hint.

Built-in tool interaction with last-tool guard: connection remove rejects when tools[] would be empty. But if removing the last connection-backed tool leaves only built-ins (which are "carried through unchanged"), is the result considered empty? The guard should clarify whether empty means zero total entries or zero connection-backed entries.

show on pending toolbox: create records locally, but show does GET /toolboxes/{name} which will 404 for a pending-only toolbox. Should show fall back to displaying the pending record metadata, or explicitly error with "toolbox is pending, run connection add first"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: Add ai.toolbox commands for CRUD on Toolboxes in Foundry Project

4 participants