feat: launch Agent Inspector from azd ai agent run #8264
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adds automatic Agent Inspector launch when running a local agent via azd ai agent run, with an opt-out flag, and introduces a standalone azure.ai.inspector extension that serves the Inspector SPA and proxies JSON-RPC/WebSocket + HTTP/SSE traffic to the local agent.
Changes:
- Add `--no-inspector` to `azd ai agent run` and launch Inspector via the azd workflow service using a silent mode.
- Implement the Inspector extension server/runtime (embedded SPA, WS JSON-RPC bridge, HTTP fetch/invoke + SSE proxying) and add tests.
- Update extension packaging (version bump, metadata/README/changelog, build scripts, lint config) and include bundled SPA assets.
Reviewed changes
Copilot reviewed 34 out of 119 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cli/azd/extensions/azure.ai.inspector/version.txt | Bumps inspector extension version. |
| cli/azd/extensions/azure.ai.inspector/main.go | Updates module import path for extension entrypoint. |
| cli/azd/extensions/azure.ai.inspector/internal/version/version.go | Adds build-time version/commit/date variables. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/server_test.go | Adds server routing test for SPA index fallback. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/server.go | Adds local HTTP server + SPA asset handler + WS endpoint. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/rpc.go | Implements JSON-RPC handling over WebSocket for the SPA. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_sse_test.go | Adds test for upstream SSE cancellation on WS close. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_sse.go | Implements SSE proxying and streaming to SPA (and optional sink). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_fetch.go | Implements HTTP fetch/invoke proxying and response shaping. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/index.html | Adds embedded SPA entrypoint HTML. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/qwen-T0CAGeOv.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/qwen-B0layTYq.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/minimax20-MO7AnRq7.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/minimax-1g4txH6T.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/kimi_o_80-BtOGfzOQ.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/kimi_o_20-DhxCfzjk.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/hljs/github.css | Adds bundled SPA syntax highlighting theme. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/hljs/github-dark.css | Adds bundled SPA syntax highlighting theme. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/foundry-openai-tmS3rrL9.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/foundry-openai-kEJt47xx.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-openai-OsxdyxhS.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-openai-Da_qp4ay.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-minimax-cFev7e0G.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-minimax-BCqYI7Go.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-kimi-BZ4xkefv.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-kimi-B-UniIay.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-deepseek-DDRLQatW.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/fireworks-deepseek-BG-COsKx.svg | Adds bundled SPA asset (composite icon). |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets/assets/ai21-labs-D2iYCjBT.svg | Adds bundled SPA asset. |
| cli/azd/extensions/azure.ai.inspector/internal/inspector/assets.go | Embeds SPA assets into the extension binary. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/version.go | Switches version reporting to new internal/version package. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/sse_render.go | Adds SSE rendering for terminal mirroring. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/root_test.go | Adds tests ensuring launch is an explicit subcommand + metadata. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/root.go | Refactors root command, adds launch/listen/metadata, adds debug logging hook. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/metadata.go | Removes redundant metadata command wrapper. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/inspector.go | Adds azd ai inspector launch command implementation. |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/debug.go | Adds debug logging routing to file (or discard). |
| cli/azd/extensions/azure.ai.inspector/internal/cmd/context.go | Removes context command (no longer part of extension surface). |
| cli/azd/extensions/azure.ai.inspector/go.sum | Updates module dependency checksums. |
| cli/azd/extensions/azure.ai.inspector/go.mod | Renames module path and updates dependencies for inspector runtime. |
| cli/azd/extensions/azure.ai.inspector/extension.yaml | Updates extension metadata, version, examples, and required azd version. |
| cli/azd/extensions/azure.ai.inspector/cspell.yaml | Adds dictionary words for extension-specific terms. |
| cli/azd/extensions/azure.ai.inspector/ci-build.ps1 | Updates build flags and linker vars to new version package + hardening. |
| cli/azd/extensions/azure.ai.inspector/build.sh | Updates linker vars path for version embedding. |
| cli/azd/extensions/azure.ai.inspector/build.ps1 | Updates linker vars path for version embedding. |
| cli/azd/extensions/azure.ai.inspector/README.md | Adds detailed usage and development docs. |
| cli/azd/extensions/azure.ai.inspector/CHANGELOG.md | Updates changelog for 0.1.0-preview. |
| cli/azd/extensions/azure.ai.inspector/AGENTS.md | Adds extension-specific conventions/instructions. |
| cli/azd/extensions/azure.ai.inspector/.golangci.yaml | Adds lint configuration for the extension module. |
| cli/azd/extensions/azure.ai.inspector/.gitattributes | Disables whitespace normalization for bundled SPA artifacts. |
| cli/azd/extensions/azure.ai.agents/internal/cmd/run_test.go | Adds tests for --no-inspector and workflow-based inspector launch. |
| cli/azd/extensions/azure.ai.agents/internal/cmd/run.go | Launches inspector by default after agent is ready; adds opt-out and install guidance. |
Comments suppressed due to low confidence (5)
cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_sse.go:1
- This code emits user-facing output directly to stdout/stderr (`printUserInput`, `fmt.Printf`, `fmt.Fprintln`) even when the inspector is launched with `--silent` from `azd ai agent run`, which contradicts the PR's goal to suppress inspector output. Additionally, the error message hardcodes `POST` even though the request method can vary. Route all output through the configurable logger/SSESink (or gate it on a verbosity/debug flag), and construct the error string using the resolved request method.
cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_sse.go:1
- This code emits user-facing output directly to stdout/stderr (`printUserInput`, `fmt.Printf`, `fmt.Fprintln`) even when the inspector is launched with `--silent` from `azd ai agent run`, which contradicts the PR's goal to suppress inspector output. Additionally, the error message hardcodes `POST` even though the request method can vary. Route all output through the configurable logger/SSESink (or gate it on a verbosity/debug flag), and construct the error string using the resolved request method.
cli/azd/extensions/azure.ai.inspector/internal/inspector/proxy_fetch.go:1
- Echoing raw user input directly to stderr can leak potentially sensitive content into terminals/log capture (and currently bypasses `--silent`). Consider removing this entirely or only emitting it via the logger under an explicit debug/verbose flag so it is opt-in.
cli/azd/extensions/azure.ai.inspector/internal/inspector/server.go:1
- The SPA fallback currently serves `index.html` for any `fs.Stat` error, not just missing files. That can mask real problems (e.g., permission/corrupt embed) and turn them into misleading 200 responses. Prefer falling back only on not-exist errors (e.g., `errors.Is(err, fs.ErrNotExist)`), and return a 500 for other errors.
cli/azd/extensions/azure.ai.inspector/internal/inspector/rpc.go:1
- Spawning an unbounded goroutine per inbound WS message can create avoidable memory/CPU pressure if the client misbehaves (or if the UI sends bursts). Consider handling non-streaming methods inline, and only offloading streaming-related methods; or introduce a bounded worker/semaphore to limit concurrent handlers.
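The `server.go` comment above (fall back to `index.html` only for missing files) could look roughly like the sketch below. The helper `resolveAsset`, the `brokenFS` type, and the `demoAssets` file set are hypothetical names used only for illustration; the PR's actual handler is not shown in this review, so this is a sketch under those assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net/http"
	"testing/fstest"
)

// resolveAsset (hypothetical helper) decides which file to serve for a
// request path and which status code to use.
func resolveAsset(assets fs.FS, name string) (string, int) {
	_, err := fs.Stat(assets, name)
	switch {
	case err == nil:
		// The asset exists: serve it as-is.
		return name, http.StatusOK
	case errors.Is(err, fs.ErrNotExist):
		// Unknown path: treat it as a client-side route and serve the SPA shell.
		return "index.html", http.StatusOK
	default:
		// Any other error (permission, corrupt embed) is a real failure.
		return "", http.StatusInternalServerError
	}
}

// brokenFS simulates a file system whose Open always fails with a
// permission error, standing in for a damaged embed.
type brokenFS struct{}

func (brokenFS) Open(name string) (fs.File, error) {
	return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrPermission}
}

// demoAssets is an in-memory stand-in for the embedded SPA bundle.
var demoAssets = fstest.MapFS{
	"index.html":    &fstest.MapFile{Data: []byte("<html></html>")},
	"assets/app.js": &fstest.MapFile{Data: []byte("console.log(1)")},
}

func main() {
	for _, name := range []string{"assets/app.js", "chat/thread-1"} {
		file, status := resolveAsset(demoAssets, name)
		fmt.Printf("%s -> %q (%d)\n", name, file, status)
	}
	file, status := resolveAsset(brokenFS{}, "index.html")
	fmt.Printf("broken fs -> %q (%d)\n", file, status)
}
```

Keeping the decision in a small pure function makes the non-`ErrNotExist` branch easy to cover in a unit test without starting a server.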
wbreza
left a comment
Thanks for this — the Inspector auto-launch is a great UX. I've focused this review on the new attack surface, concurrency, and the default-behavior change. Findings are grouped by priority.
Existing Copilot review findings (proxy_sse stdout bypass, proxy_fetch user-input echo, server.go SPA fallback, rpc.go unbounded goroutine, debug.go log file leak, inspector.go scanner.Err) still apply at HEAD — please resolve those alongside the items below; several chain together.
🔴 Blockers
1. Local SSRF via unauthenticated WS + arbitrary-URL HTTP proxy
internal/inspector/server.go (handleWS) + proxy_fetch.go / proxy_sse.go
/agentdev/ws/rpc is exposed with no token / no handshake — only an Origin check (bypassable, see #2). Once connected, webviewProxy/fetch, webviewProxy/invoke, and webviewProxy/fetchSSE accept a client-supplied URL and forward it via http.NewRequest with no scheme/host validation.
Any local process (poisoned npm postinstall, malicious VS Code extension, a browser tab once #2 is exploited) can:
- Exfiltrate IMDS managed-identity tokens (`http://169.254.169.254/metadata/identity/...`)
- Read intranet services / loopback dev databases / other CLIs' RPC ports
- POST arbitrary payloads to the local agent or any other localhost service
Auto-launching this from azd ai agent run by default makes the dev-machine surface materially larger.
Suggested fix:
- Mint a per-session random token on `Server.Start`, embed it in `index.html`, require it on WS upgrade (`Sec-WebSocket-Protocol` or query param). Reject upgrades without it.
- Constrain proxy URLs: scheme in {`http`, `https`}, host resolves to `127.0.0.1`/`::1`/`localhost`, ideally `port == cfg.AgentPort`.
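As a rough illustration of the two suggestions above, the sketch below mints a random session token and validates a client-supplied proxy URL. The function names (`newSessionToken`, `tokenMatches`, `validateProxyURL`) are hypothetical, and the host check compares literal loopback names and addresses rather than performing DNS resolution, which is a simplifying assumption.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// newSessionToken returns 32 random bytes, hex-encoded (64 characters).
func newSessionToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generate token: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

// tokenMatches compares tokens in constant time and rejects empty tokens.
func tokenMatches(want, got string) bool {
	if want == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(want), []byte(got)) == 1
}

// validateProxyURL accepts only http(s) URLs that target a loopback host
// on the expected agent port and carry no embedded credentials.
func validateProxyURL(raw string, agentPort int) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("scheme must be http or https")
	}
	if u.User != nil {
		return errors.New("embedded credentials are not allowed")
	}
	host := u.Hostname()
	if host != "localhost" {
		ip := net.ParseIP(host)
		if ip == nil || !ip.IsLoopback() {
			return errors.New("host must be a loopback address")
		}
	}
	if u.Port() != strconv.Itoa(agentPort) {
		return errors.New("port must match the agent port")
	}
	return nil
}

func main() {
	token, err := newSessionToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("token length:", len(token))
	for _, raw := range []string{
		"http://127.0.0.1:8088/responses",
		"http://169.254.169.254/metadata/identity/oauth2/token",
		"file:///etc/passwd",
	} {
		fmt.Printf("%s -> %v\n", raw, validateProxyURL(raw, 8088))
	}
}
```

The port 8088 in `main` is an arbitrary example value. In a real server the token would be checked during the WS upgrade and the URL check would run before any outbound request is built.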
2. DNS rebinding bypasses CheckOrigin
internal/inspector/server.go:62-68
CheckOrigin: func(r *http.Request) bool {
    origin := r.Header.Get("Origin")
    if origin == "" { return true }
    return origin == "http://"+r.Host || origin == "https://"+r.Host
}

r.Host is attacker-controlled. Classic DNS rebinding: developer visits evil.example → JS rebinds the host to 127.0.0.1 → browser sends Host: evil.example:8087, Origin: http://evil.example:8087, check passes. Binding to 127.0.0.1 does not prevent this, because the browser itself ends up connecting to the loopback address.
Combined with #1, any browser visit to a malicious page while Inspector is running becomes zero-click exploitation for the full inspector lifetime.
Suggested fix: Validate r.Host against an explicit allowlist (127.0.0.1:<port>, [::1]:<port>, localhost:<port>); reject empty Origin on WS upgrades; ideally also gate on the session token from #1.
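One possible shape for that allowlist is sketched below. `hostAllowed`, `originAllowed`, and `checkOrigin` are hypothetical names; `checkOrigin` returns a function with the same signature as the `CheckOrigin` callback quoted above, and the port 8087 is taken from the example in this comment.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/url"
	"strconv"
)

// hostAllowed reports whether a host:port value names a loopback host on
// the expected port. It never trusts the value just because it was sent.
func hostAllowed(hostport string, port int) bool {
	host, p, err := net.SplitHostPort(hostport)
	if err != nil || p != strconv.Itoa(port) {
		return false
	}
	switch host {
	case "127.0.0.1", "::1", "localhost":
		return true
	}
	return false
}

// originAllowed requires a non-empty http(s) Origin and checks both the
// Origin and the Host header against the loopback allowlist.
func originAllowed(origin, requestHost string, port int) bool {
	if origin == "" {
		return false
	}
	u, err := url.Parse(origin)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false
	}
	return hostAllowed(u.Host, port) && hostAllowed(requestHost, port)
}

// checkOrigin adapts originAllowed to a func(*http.Request) bool callback.
func checkOrigin(port int) func(*http.Request) bool {
	return func(r *http.Request) bool {
		return originAllowed(r.Header.Get("Origin"), r.Host, port)
	}
}

func main() {
	check := checkOrigin(8087)
	rebound := &http.Request{
		Host:   "evil.example:8087",
		Header: http.Header{"Origin": []string{"http://evil.example:8087"}},
	}
	local := &http.Request{
		Host:   "localhost:8087",
		Header: http.Header{"Origin": []string{"http://localhost:8087"}},
	}
	fmt.Println("rebound request allowed:", check(rebound))
	fmt.Println("local request allowed:", check(local))
}
```

Because the rebound request still carries the attacker's hostname in both headers, checking them against fixed loopback names rejects it even though the TCP connection arrives on 127.0.0.1.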
3. Nil-map panic race: registerStream after cleanup()
internal/inspector/rpc.go:94, 295 ↔ proxy_sse.go:46, proxy_fetch.go:138
handleWS dispatches each message in a new goroutine (go sess.handleMessage(&msg)). On WS close, defer sess.cleanup() sets s.streams = nil. An already-scheduled webviewProxy/fetchSSE or proxyInvoke goroutine then calls s.registerStream(id, cancel) → s.streams[id] = cancel on a nil map → process-wide panic. Reproducible on any disconnect-during-request scenario.
Suggested fix:
func (s *rpcSession) registerStream(id string, cancel context.CancelFunc) {
    s.streamsMu.Lock()
    defer s.streamsMu.Unlock()
    if s.closed { cancel(); return }
    s.streams[id] = cancel
}

…and in cleanup(): clear(s.streams) + s.closed = true under the same mutex.
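A fuller, self-contained version of that idea might look like the sketch below. The field names follow the snippet above; the constructor, the boolean return value, and the two demo helpers are assumptions added for illustration. Entries are removed with `delete` inside the loop, which has the same effect as `clear(s.streams)`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// rpcSession tracks cancel functions for in-flight streams.
type rpcSession struct {
	streamsMu sync.Mutex
	streams   map[string]context.CancelFunc
	closed    bool
}

func newRPCSession() *rpcSession {
	return &rpcSession{streams: make(map[string]context.CancelFunc)}
}

// registerStream stores cancel unless the session is already closed, in
// which case it cancels immediately and reports false.
func (s *rpcSession) registerStream(id string, cancel context.CancelFunc) bool {
	s.streamsMu.Lock()
	defer s.streamsMu.Unlock()
	if s.closed {
		cancel()
		return false
	}
	s.streams[id] = cancel
	return true
}

// cleanup cancels every registered stream and marks the session closed.
// The map is emptied but never set to nil, so late writers cannot panic.
func (s *rpcSession) cleanup() {
	s.streamsMu.Lock()
	defer s.streamsMu.Unlock()
	if s.closed {
		return
	}
	s.closed = true
	for id, cancel := range s.streams {
		cancel()
		delete(s.streams, id)
	}
}

// registerAfterCleanup simulates a handler goroutine that runs after the
// WebSocket has closed.
func registerAfterCleanup() (registered, cancelled bool) {
	s := newRPCSession()
	s.cleanup()
	ctx, cancel := context.WithCancel(context.Background())
	registered = s.registerStream("late", cancel)
	return registered, ctx.Err() != nil
}

// cleanupCancelsActive checks that cleanup cancels a stream registered earlier.
func cleanupCancelsActive() bool {
	s := newRPCSession()
	ctx, cancel := context.WithCancel(context.Background())
	s.registerStream("active", cancel)
	s.cleanup()
	return ctx.Err() != nil
}

func main() {
	registered, cancelled := registerAfterCleanup()
	fmt.Println("late registration stored:", registered, "cancelled:", cancelled)
	fmt.Println("active stream cancelled by cleanup:", cleanupCancelsActive())
}
```

Returning a boolean from `registerStream` lets the caller skip the upstream request entirely when the session is gone.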
4. No recover in per-message goroutines
internal/inspector/rpc.go:94
go sess.handleMessage(&msg) runs against client-controlled JSON; any nil-deref / OOB in any handler crashes the entire inspector process — and in auto-launch mode also propagates back to the parent azd ai agent run.
Suggested fix: Wrap the goroutine body in defer func() { if r := recover(); r != nil { /* log + send RPC error if msg.ID != nil */ } }().
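A minimal sketch of such a wrapper follows. The `rpcMessage` type, `safeHandle`, and the `onPanic` callback are hypothetical stand-ins for the PR's real message type and error-reply path, which are not shown in this review.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// rpcMessage is a simplified stand-in for the session's message type.
type rpcMessage struct {
	ID     *int
	Method string
}

// safeHandle runs handle and converts a panic into a log line plus an
// optional error callback, so one bad message cannot crash the process.
func safeHandle(msg *rpcMessage, handle func(*rpcMessage), onPanic func(id *int, recovered any)) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("rpc handler panicked: %v", r)
			// Only requests (messages with an ID) get an error reply.
			if msg != nil && msg.ID != nil && onPanic != nil {
				onPanic(msg.ID, r)
			}
		}
	}()
	handle(msg)
}

// runFaultyHandler dispatches a handler that writes to a nil map and
// reports whether the panic was contained.
func runFaultyHandler() (reportedID int, recovered bool) {
	id := 7
	msg := &rpcMessage{ID: &id, Method: "webviewProxy/fetch"}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		safeHandle(msg, func(*rpcMessage) {
			var streams map[string]int
			streams["boom"] = 1 // assignment to nil map panics
		}, func(gotID *int, _ any) {
			reportedID = *gotID
			recovered = true
		})
	}()
	wg.Wait()
	return reportedID, recovered
}

func main() {
	id, recovered := runFaultyHandler()
	fmt.Println("recovered:", recovered, "request id:", id)
}
```

With this shape, the dispatch site would become `go safeHandle(&msg, sess.handleMessage, sendError)`, where `sendError` is whatever function writes the JSON-RPC error response.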
🟠 Major
- Empty `Origin` accepted unconditionally (`server.go:64-66`): `if origin == "" { return true }` lets any non-browser local process reach #1's SSRF without needing the rebinding chain.
- `openUrlInBrowser` is an unauth'd URI-handler launcher (`rpc.go:186-201`): Any WS client can drive `browser.OpenURL` with arbitrary schemes (`ms-msdt:`, `file:`, `vscode:`, etc.). Pairs poorly with the Follina-class URI-handler vulnerabilities that ship regularly. Allowlist `http`/`https` only; reject embedded credentials.
- Unbounded WS frame size & no deadlines → memory DoS (`rpc.go:59-95`): No `conn.SetReadLimit`, no `SetReadDeadline`, no ping/pong. A single 10 GB frame OOMs the process. Set `conn.SetReadLimit(1<<20)`, periodic ping/pong, cap concurrent sessions.
- Body logging bypasses `--silent` via `s.logger.Printf` (`proxy_fetch.go:111, 152`): `s.logger.Printf("invoke ... body: %s", ..., p.Body)` is not gated by `cfg.Silent`. With auto-launch, every prompt + every model response lands on the parent's stderr. (Distinct from Copilot's `fmt.Fprintln` finding in `proxy_sse.go`; this is the structured logger path.) When `Silent`, log only metadata (requestID, method, status, length).
- Auto-launched inspector lifetime not bound to parent process (`extensions/azure.ai.agents/internal/cmd/run.go:258-326` + `inspector.go:124-132`): `workflow.Run(ctx, ...)` spawns inspector with no PID capture, no explicit `Kill` on `cancel()`. Please verify: after `Ctrl+C` on `azd ai agent run`, does `netstat -ano | findstr :8087` show port 8087 still bound? If yes, inspector keeps proxying with no UI, subsequent runs collide on the port, and a stale proxy stays reachable. Capture child PID and kill on cleanup, or pass `--parent-pid` for self-termination.
- `streamSSELines` SSESink deadlock / goroutine leak (`proxy_sse.go:116-165` ↔ `inspector.go:141 injectSSEEvents`): `readSSEStream` returns early on `response.completed`, dropping the inner pipe reader; `injectSSEEvents` blocks on `Fprintln` into an unbuffered pipe with no reader → goroutine + HTTP body pinned until `streamCtx` is cancelled. Happens on normal completion when trailing events arrive. Close the sink writer on completion, or make `injectSSEEvents` ctx-aware.
- `proxyInvoke` streaming branch can leak response body on panic (`proxy_fetch.go:137-145`): `go s.pumpSSE(p.RequestID, resp, true)` has no top-level `defer resp.Body.Close()`; body closure depends on the cancel-driven forcing goroutine that may not fire on panic / early return. Add `defer resp.Body.Close()` at top of `pumpSSE` (idempotent with existing path).
- Default-behavior UX change is ungated and undocumented (`extensions/azure.ai.agents/internal/cmd/run.go` + `azure.ai.agents/README.md`): Auto-launching Inspector is a material UX shift with no env-var/global opt-out (only per-invocation `--no-inspector`), no Preview gating per `docs/reference/feature-status.md`, no README update on the agents extension explaining the new default / `--no-inspector` / headless behavior, and no telemetry distinguishing auto-launch attempts/successes/failures. CI and headless dev environments will see surprising behavior. Add an `AZD_AI_AGENT_AUTO_INSPECTOR=false` (or similar) global opt-out, document the new default, mark as Preview, emit telemetry.
- No Content-Security-Policy or hardening headers on the served SPA (`server.go:129-137`): Only `Content-Type` + `Cache-Control` set. Given the 2.6 MB upstream JS bundle, a supply-chain compromise or XSS gets full origin powers. Add `Content-Security-Policy: default-src 'self'; connect-src 'self' ws://localhost:<port>`, `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: no-referrer`.
- Missing OSS attribution for embedded third-party SPA bundle: Embeds ~2.6 MB from `qidon/tryinspector` (KaTeX fonts, highlight.js themes, etc.) with no `NOTICE` / `THIRD_PARTY_NOTICES` / LICENSE references. Microsoft OSS policy generally requires attribution for redistributed third-party code, and incompatible licenses (GPL/AGPL) would block release. Please confirm upstream license compatibility and add the appropriate notices to `cli/azd/extensions/azure.ai.inspector/`.
- No integration test that Inspector failure doesn't break agent run (`extensions/azure.ai.agents/internal/cmd/run_test.go`): The core safety property of the new default-on behavior ("inspector launch failure must not crash agent run") has no test. Mock the workflow client to return RPC errors and panics; assert the agent continues. Also add an explicit test that `--no-inspector` skips the workflow call entirely.
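For the hardening-headers item in the list above, a middleware along the following lines could wrap the SPA handler. `withSecurityHeaders` and `probeHeaders` are hypothetical names, and the policy string is copied from the suggestion as-is; a real bundle may need a looser policy (for example for inline styles), which would have to be checked against the actual SPA.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withSecurityHeaders sets the response headers suggested in the review
// before delegating to the wrapped handler.
func withSecurityHeaders(port int, next http.Handler) http.Handler {
	csp := fmt.Sprintf("default-src 'self'; connect-src 'self' ws://localhost:%d", port)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Content-Security-Policy", csp)
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		h.Set("Referrer-Policy", "no-referrer")
		next.ServeHTTP(w, r)
	})
}

// probeHeaders sends one in-memory request through the middleware and
// returns the response headers, without opening a socket.
func probeHeaders(port int) http.Header {
	inner := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	withSecurityHeaders(port, inner).ServeHTTP(rec, req)
	return rec.Result().Header
}

func main() {
	headers := probeHeaders(8087)
	for _, name := range []string{
		"Content-Security-Policy",
		"X-Content-Type-Options",
		"X-Frame-Options",
		"Referrer-Policy",
	} {
		fmt.Printf("%s: %s\n", name, headers.Get(name))
	}
}
```

Setting the headers in one wrapper keeps the asset handler unchanged and applies the same policy to `index.html` and every bundled file.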
🟡 Minor
- `isInspectorExtensionMissingMessage` brittle string match (`run.go:369-373`): Substring-matches `"unknown command"` / `"inspector"` / `"unknown flag: --port"`. If azd's workflow runner changes wording, install guidance silently regresses to a generic warning. Prefer a typed sentinel/status code, or rely on the existing pre-check.
- Windows Ctrl+C kills agent child ungracefully (`run.go:203-245`): `exec.CommandContext` on Windows = `TerminateProcess`; no chance to flush OTel/logs. Use `CREATE_NEW_PROCESS_GROUP` + `GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT)` with grace timeout.
- `loadAzdEnvironment` error silently swallowed (`run.go:177-182`): User gets confusing downstream "endpoint is empty" errors. Mirror the warning pattern used right below for `resolveConnectionCredentials`.
- No size limit on error/buffered `io.ReadAll(resp.Body)` (`proxy_sse.go:~76`, `proxy_fetch.go:~55`): Malicious/buggy upstream can OOM Inspector. Use `io.LimitReader` (4 KB for error bodies, 16 MB for buffered).
- `streamHTTPClient` has no response-header timeout (`proxy_fetch.go:58`): Same client is used for `proxyInvoke` before content-type is known; a hung non-SSE upstream blocks the goroutine. Set `Transport.ResponseHeaderTimeout: 30s`.
- `apim-request-id` to stdout in standalone mode (`proxy_sse.go:75-77`): `fmt.Printf("Trace ID: %s\n", ...)` writes to stdout; breaks `| jq` / stdout-as-data flows. Use stderr.
- `.golangci.yaml` line-length 220 vs azd core 125: Inherit from core or use file-scoped `//nolint:lll` for justified long lines (mostly the embedded asset SHA-stamped paths).
- `runInspector` browser-open goroutine leak on `Start` failure (`inspector.go:124-132`): Goroutine blocks on `<-ready` forever if `Start` fails before binding. Close `ready` in error path, or select on `ctx.Done()`.
- Defense-in-depth on `assetsHandler` path handling (`server.go:139-150`): `http.ServeMux` cleans paths today, but a future re-mount with `StripPrefix` could bypass that. Explicit `path.Clean` + escape rejection.
- Test coverage gaps: Missing tests for: SSE upstream 4xx/5xx + read-error paths, malformed SSE lines, SPA fallback for real `fs.Stat` errors vs `ErrNotExist`, JSON-RPC error response structure, very long `data:` lines (>64 KB scanner buffer), `--no-inspector` skips workflow call.
- Test flakiness: `time.Sleep` / hardcoded timeouts in `proxy_sse_test.go`, `server_test.go`; replace with synchronization primitives or `t.Context()` deadlines.
- Install warning has no rate-limit (`run.go ~200`): Prints on every run if extension absent. Consider a once-per-day marker file.
- CHANGELOG entry minimal: Agents extension uses richer entries linking PR/feature.
- `requiredAzdVersion` accuracy (`extension.yaml:8`): Verify `>1.23.13` matches the minimum azd version supporting `workflow.Run` for extensions.
- AGENTS.md error-handling guidance gap (`extensions/azure.ai.inspector/AGENTS.md:51`): Clarify this extension uses plain wrapped errors (no `exterrors`), unlike the agents extension.
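The two upstream-safety items in the list above (bounded body reads and a response-header timeout) can be sketched as follows. `readBounded`, `readBoundedString`, `newStreamClient`, and `errBodyTooLarge` are hypothetical names; the 4 KB limit and the 30-second timeout are the values proposed in the review.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// errBodyTooLarge is returned when a body exceeds the configured limit.
var errBodyTooLarge = errors.New("response body exceeds limit")

// readBounded reads at most limit bytes. It reads one extra byte so an
// oversized body is reported as an error instead of silently truncated.
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, errBodyTooLarge
	}
	return data, nil
}

// readBoundedString is a small convenience wrapper for demos and tests.
func readBoundedString(body string, limit int64) (int, error) {
	data, err := readBounded(strings.NewReader(body), limit)
	return len(data), err
}

// newStreamClient builds a client that gives up if the upstream sends no
// response headers in time, while leaving streamed bodies unbounded in time.
func newStreamClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{ResponseHeaderTimeout: 30 * time.Second},
	}
}

func main() {
	const errorBodyLimit = 4 << 10 // 4 KB, as suggested for error bodies

	n, err := readBoundedString("upstream returned 502", errorBodyLimit)
	fmt.Println("small body:", n, err)

	n, err = readBoundedString(strings.Repeat("x", errorBodyLimit+1), errorBodyLimit)
	fmt.Println("oversized body:", n, err)

	_ = newStreamClient() // would replace the shared streaming client
}
```

`ResponseHeaderTimeout` only covers the wait for headers, so it suits SSE: a long-lived event stream is not cut off once the response has started.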
⚪ Nits
- `srv.URL()` returns `http://localhost:%d` while listener binds `127.0.0.1` only (`server.go:73-75, 85`)
- `cspell.yaml`: comment why `azureaiinspector` (Go module camelCase) is allowed
- `build.sh:12`: comment casing vs actual `$EXTENSION_ID_SAFE`
- `debug.go` log file close error swallowed
- `root_test.go` `findCommand` no recursion depth guard
- `run_test.go` `appendPortEnvVars` relies on slice index ordering
Happy to discuss any of the above — particularly grateful for the contribution and the careful refactor of the extension internals. The blockers are all addressable with localized changes (session token + URL host check + nil-map fix + recover).
trangevi
left a comment
There is a `.github/workflows/approval-ext-azure-ai-inspector.yml` file in the azure-dev repo that controls a GitHub pipeline which blocks PRs on an approval list separate from the standard codeowners file. The intent is to keep getting helpful reviews from the broader AZD group (sticking to AZD standards, Go best practices, etc.) while still blocking the PR on a smaller set of people who know the proper code flow for each extension. We should update that file with folks on your team for proper ownership of this new extension.
In .github/workflows/approval-ext-azure-ai-inspector.yml, add a REQUIRED_APPROVERS environment variable to the step:
- name: Check for required team approval
  uses: actions/github-script@v9
  env:
    EXTENSION_PATH: "cli/azd/extensions/azure.ai.inspector/"
    WORKFLOW_PATH: ".github/workflows/approval-ext-azure-ai-inspector.yml"
    OVERRIDE_COMMAND: "/inspector-extension-approval override"
    REQUIRED_APPROVERS: '["user1", "user2", "user3"]' # ← add this
  with:
    script: |
      const script = require('./.github/scripts/pr-approval-foundry-extensions-shared.js');
      await script({ github, context, core });
The shared script (lines 31-33) checks for this env var and uses it instead of the default list.
Thanks for the detailed review. I went through the blocker, major, minor, and nit items and made targeted changes based on the threat model for this feature. Agent Inspector is a local developer UI served on loopback and launched from For items implemented exactly as requested, I marked them as resolved. For items that are partial or intentionally left out of this PR, I included the reason.
Blockers
Major
Minor
Nits
Added "anchenyi", "XiaofuHuang", and "swatDong" to the REQUIRED_APPROVERS list.
Summary
`azd ai agent run` now automatically launches Agent Inspector for the local agent run. Users can opt out with `--no-inspector`.
Changes
- Add `--no-inspector` to `azd ai agent run`.
- Launch Inspector in silent mode from `azd ai agent run`, so the terminal keeps showing the agent run output.
- Show install guidance when `azure.ai.inspector` is not installed: `azd extension install azure.ai.inspector`
- Keep `azd ai inspector` as a command group; Inspector only starts from `azd ai inspector launch`.
Inspector Web Demo