[C#DK v2 Aspire integration] Expose apphost query API and aspire resource management API from aspire extension#17705
Conversation
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17705Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17705" |
| } | ||
|
|
||
| /** | ||
| * Performs a one-shot `aspire ps --resources` call and returns the results directly. |
There was a problem hiding this comment.
Replace with aspire ps --format json then aspire describe --follow --apphost <path2apphost> --format json?
adamint
left a comment
There was a problem hiding this comment.
Reviewed (as a WIP/draft) by three parallel agents focused on public-API design, CLI integration correctness, and event/dispose lifecycle. There's strong cross-agent convergence on three structural issues that need to be settled before this is non-draft: the API is built on the removed aspire ps --resources flag, the onDidChangeAppHosts contract the JSDoc promises is not what the implementation delivers, and the public methods silently swallow several distinct failure modes (CLI missing, wrong appHostPath, transient CLI error). All of these matter once C# Dev Kit codes against the surface — once shipped, breaking changes get painful fast.
Blockers / High
-
aspire ps --resourceswas removed frommainin PR #17479 —fetchAppHostsOncewill silently return[]for every consumer after rebase. This is what @davidfowl's "This API is gone." comment refers to. Verified:upstream/mainsrc/Aspire.Cli/Commands/PsCommand.csno longer hass_resourcesOption, andAppHostDisplayInfo.Resourceshas been removed. After rebase the CLI will emitUnrecognized command or argument '--resources'and exit non-zero;fetchAppHostsOnceswallows that intoresolve([])sogetRunningAppHosts()returns empty for every caller. Not a flag-rename fix: post-#17479 main fans resources in via_globalDescribeStreams(oneaspire describe --follow --apphost <path>per AppHost merged client-side). A correct one-shot needs either (a)aspire ps --format jsonfor the AppHost list + per-AppHost bounded describe, or (b) a newaspire describe --snapshotmode on the CLI side. (Inline comment.) -
getRunningAppHosts()silently swallows every failure mode → C# Dev Kit cannot tellaspire-cli-missingfromno-apphosts-running.fetchAppHostsOnce(AppHostDataRepository.ts:256–276) funnels non-zero exit, JSON parse failure, ANDspawn ENOENT(when the CLI is not installed) all intoresolve([]). Note:getAspireCliExecutablePath()does not throw on missing CLI —resolveCliPathreturns{ cliPath: 'aspire', available: false }and the literal'aspire'propagates through, so spawn fails withENOENTand the consumer sees a clean empty array. The polling code (_runPsCommand) correctly surfaces these via_setError; the public API regresses that behavior. C# Dev Kit needs to be able to surface "Install the Aspire CLI" diagnostics, not just render an empty list. (Inline comment.) -
onDidChangeAppHostsonly fires while the Aspire UI is active — broken contract for any consumer that doesn't open the Aspire panel. The JSDoc onapi.ts:28–31says "Event fired when AppHost or resource state changes." Both data sources backing the event (describe-follow and ps polling) are gated on_dataActive = _panelVisible || _appHostFileOpen(AppHostDataRepository.ts:297–331). For a C# Dev Kit consumer that activates with the Aspire panel hidden and no AppHost file open, the consumer subscribes, the user runs a session, resources move Starting → Running → Failed → Restarting over 60 seconds, and the consumer receives zero events because nothing is polling. Fix is one of: (a) start an API-owned polling source when ≥1 external listener is subscribed (forces_shouldPollto betrueregardless of UI state), or (b) update the JSDoc to say "best-effort; only fires while the Aspire view is active" and tell consumers to poll. Option (a) matches the JSDoc as written. (Inline comment.) -
stopResource('foo', '')silently triggers an interactive AppHost picker via the extension RPC channel. BothstopResourceandstartResourcedeclareappHostPath: stringas required, then useif (appHostPath) args.push('--apphost', ...)so empty string falls through. The CLI then calls_connectionResolver.ResolveConnectionAsync(null, ...)→PromptForAppHostSelectionAsyncwhen multiple AppHosts are running. Critically, these two methods do NOT passnoExtensionVariables: true(unlikefetchAppHostsOnce), soASPIRE_EXTENSION_PROMPT_ENABLED=trueplus the RPC endpoint/token are set in the child env. The CLI then issues the prompt over the extension's RPC channel → a QuickPick appears to the user out of nowhere, and the promise hangs until the user picks. C# Dev Kit's reasonable test ofawait api.stopResource('myapi', '')becomes "unexpected UI prompt to the user." Fix: validateappHostPathis non-empty and absolute at the API boundary (reject otherwise), AND passnoExtensionVariables: trueto all three CLI invocations so the CLI cannot prompt through the extension at all. (Inline comment.) -
stopResource/startResourcerejection drops stderr — consumer getsexited with code Nwith no diagnostic. Neither method setsstdoutCallbackorstderrCallback, so the CLI's actual error text ("resource 'foo' not found","command 'stop' is not available for resource type 'Container'","AppHost not running") is lost. Combined with overlapping exit codes fromCliExitCodes.FailedToFindProject,CliExitCodes.InvalidCommand, and the innerExecuteResourceCommandAsync, the consumer cannot tell what went wrong. Fix: buffer stderr/stdout and include them in the rejection. (Inline comment.)
Medium
-
noExtensionVariables: trueinconsistency.fetchAppHostsOncecorrectly passes it.stopResource/startResourcedo not, so the CLI inheritsASPIRE_EXTENSION_ENDPOINT,ASPIRE_EXTENSION_TOKEN,ASPIRE_EXTENSION_CERT,ASPIRE_EXTENSION_PROMPT_ENABLED=true,DEBUG_SESSION_PORT,DEBUG_SESSION_TOKEN,DEBUG_SESSION_SERVER_CERTIFICATE. These RPC channels were established for the run-debug flow that owns them; reusing them outside that flow is wrong — see #4 for the user-visible symptom. PassnoExtensionVariables: truefor all three CLI invocations. -
onDidChangeAppHostsover-fires per NDJSON line and on UI-state transitions → thundering CLI invocations._handleDescribeLinefires_onDidChangeDatafor every snapshot (a single resource emits ~10 snapshots in 5 seconds: Building → Starting → progress ticks → Running → probing → Healthy). With 10 resources, that's ~80 fires in 5 seconds. The emitter is also fired fromsetViewMode,_clearWorkspaceAppHost,_setError, and view-mode toggles — intentionally over-fired because the tree-view re-render is cheap. Each fire causes a programmatic consumer to callgetRunningAppHosts(), which spawns a freshaspire psprocess. 80 process spawns + 80 backchannel scans in 5 seconds — measurable CPU and CLI-startup latency while the AppHost itself is launching. Either coalesce/debounce (e.g., 250 ms), or expose a narrower event scoped to actual snapshot deltas (vscode.Event<{ appHostPath: string; changedResourceNames: readonly string[] }>matches VS Code Debug API conventions). -
No
CancellationToken/AbortSignal/ timeout on any method. A hung CLI (slow backchannel scan, wedged resource command) leaves the Promise pending forever. The polling code tracks processes in_psProcessesanddispose()terminates them; promises returned from the public API have no such escape valve, and a consumer cancellation cannot tear down the child. Add an optionalcancellationToken?: vscode.CancellationTokenper method; terminate the child on cancel and reject withvscode.CancellationError. Add a default ~30 s timeout with a typed timeout error. -
dcpServerInfois newly exported fromactivate(), leaking the DCP TLS token + certificate to any extension. Previously onlyrpcServerInfowas exposed.DcpServerConnectionInfois{ address, token, certificate }— any extension can nowvscode.extensions.getExtension('aspire-vscode').exports.dcpServerInfoand impersonate DCP against the running session. If C# Dev Kit needs it, scope and document the trust boundary; otherwise drop it. -
Spreading
...apiinto the activation return collides with the existing/future*ServerInfokeys.return { rpcServerInfo, dcpServerInfo, ...api }puts API methods in the same namespace as RPC bootstrap data, prevents semver versioning, and risks accidental property collisions. Prefer the Azure Functions extension's pattern (seeextension/src/debugger/languages/azureFunctions.ts:25):getApi(version: string): AspireExtensionApi | undefined, or at minimum nest underapi: {...}. -
ResourceInfois a lossy subset — silently dropsisInternal, health, exit code, commands.mapResourcefilters!u.isInternalunconditionally and exposes only{ name, url }for endpoints. Consumers doing service discovery cannot opt back in.healthStatus,healthReports,exitCode,commands,stateStylefromResourceJsonare all dropped — exactly the data a "resource management API" consumer needs to render status. Either exposeisInternaland let consumers filter, or take an options object (includeInternal?: boolean); surfacehealthStatus/exitCodeif resource management is the value prop. -
state: string | nullis opaque with no typed value set, no forward-compat story. Consumers will writeif (r.state === 'Running'). The CLI/hosting layer adds states over time; every new state silently breaks consumer logic. Export a union type with the(string & {})escape hatch ('Running' | 'Stopped' | 'Starting' | 'Stopping' | 'Failed' | 'Exited' | 'Finished' | 'Building' | (string & {})) and document that consumers must default-branch unknown values. -
No
restartResourceand no genericexecuteResourceCommand. The extension already implementsaspire-vscode.restartResourceandaspire-vscode.executeResourceCommand(seeextension/src/extension.ts:115, 117), and PR #17698 lands a custom-commands tree. Shipping start/stop now and adding the rest later is either two partial release cycles or a churning interface. Decide whetherexecuteResourceCommand(resourceName, commandName, args, appHostPath?)is the right primitive and have start/stop call it, or commit to the full triad now. -
No per-AppHost filter on
getRunningAppHosts(). Every call forks a child process and scans all AppHost backchannels. C# Dev Kit polling a single solution pays this cost on every tick. AddgetRunningAppHosts(appHostPath?: string)and pass through asaspire ps --apphost <path>. -
Positional args lock out future options without breaking changes.
stopResource(resourceName, appHostPath)cannot growforce,timeout,signal,cancellationTokenwithout overloads or a breaking signature change. Once C# Dev Kit codes against this, every option addition is breaking. Use an options bag:stopResource(resourceName: string, options?: { appHostPath?: string; cancellationToken?: vscode.CancellationToken; ... }). -
fetchAppHostsOncedesynchronizes from the polling cache → suppresses real change events. It runs an independentaspire ps --resourcesand never updates_appHosts. The polling change-detection atAppHostDataRepository.ts:668usesJSON.stringify(parsed) !== JSON.stringify(this._appHosts)— out-of-band readers leave the cached baseline inconsistent, so transient transitions (A: Running → A: Failed → A: Running between the consumer's one-shot call and the next poll) compare equal to cache and fire no event; the consumer was told nothing changed when in fact it oscillated. -
No event-ordering guarantee for
stop/startResource. Promise resolves when the CLI process exits, butonDidChangeAppHostsonly fires on the next polling tick (default 30 s — see_getPollingIntervalMs) or the next describe-stream snapshot. Consumer patternawait api.stopResource(...); const apps = await api.getRunningAppHosts();may still see Running. Either kick off a fresh fetch after a successful resource command (within a bounded window) or document the pattern.
Low
-
No lifecycle guard.
fetchAppHostsOncedoesn't check_disposed;stopResource/startResourcedon't track a disposed flag at all. After deactivation, stale handles continue spawning CLI processes.onDidChangeAppHostsis the now-disposedEventEmitter(dataRepository.dispose()callsthis._onDidChangeData.dispose()), so post-disposeevent(handler)may throw. Add an_disposedflag the API closes over; reject calls with a typed error post-dispose. -
No concurrency dedupe → N parallel callers spawn N processes. The polling path coalesces via
_fetchInProgress; the public API does not. Cache the in-flight Promise. -
Promise rejection semantics not documented. "Resolves when the CLI command completes" doesn't say what happens when the resource doesn't exist, is already in the target state, the apphost isn't running, the CLI is not installed, or two simultaneous
stopResourcecalls race. C# Dev Kit will discover these empirically. Spell them out per method now before consumers code against undocumented behavior.
Verified safe (not flagged)
- Argument injection via
resourceName:spawnCliProcessuseschild_process.spawn(command, args, { shell: false })(extension/src/debugger/languages/cli.ts:29–33) — shell metacharacters not interpreted. AresourceNamestarting with-is rejected byResourceCommand'sIsOptionLikeTokenvalidator (ResourceCommand.cs:408–411) with a clear error. appHostPathcontaining spaces/quotes: Safe — passed as a singleargselement withshell: false, no shell-quoting needed.aspire resource <name> start|stop --apphost <path>syntax: Verified to exist in both the PR branch andupstream/main(src/Aspire.Cli/Commands/ResourceCommand.cs;start/stopare ins_wellKnownCommands). Unlikeaspire ps --resources, this command will not break after rebase.JSON.parse(stdout)stdout/stderr mixing:PsCommanduses_interactionService.DisplayRawText(json, ConsoleOutput.Standard)for the JSON payload and routes progress throughShowStatusAsync(suppressed in non-interactive contexts). WithnoExtensionVariables: true, no RPC backchannel is established and Spectre's spinner is suppressed. Stdout should remain clean JSON.mapResourcecorrectness:appHostPathpropagated from parent (not re-extracted),projectPathextracted viaproperties?.['project.path']matches the canonicalKnownProperties.Path(src/Shared/Model/KnownProperties.cs:56). Optional-chain plus nullish-coalescing handles all ofproperties === null,properties === undefined, and missing/nullvalue.resolvedguard infetchAppHostsOnce(lines 250–275) correctly prevents double-resolution ifexitCallbackanderrorCallbackboth fire.onDidChangeAppHosts: dataRepository.onDidChangeDatais the public surface of avscode.EventEmitter, so consumer-side.dispose()of the subscription works correctly. The lifecycle issue is in #3/#7/#18, not in event-disposal semantics.
# Conflicts: # extension/src/extension.ts
Enable VS Code test hosts to acquire a scoped Aspire DCP lease for test-launched AppHosts. Lease-backed requests use per-session tokens, scoped DCP prefixes, awaitable cleanup, and DCP notification tracking without requiring a parent Aspire debug session. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run lease notification cleanup again after stopping owned debug sessions so async termination notifications cannot remain queued after the lease token is revoked. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nd update API function signature
…session management
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a public Aspire extension API and test-run DCP session leasing, plus “one-shot” CLI queries to fetch running AppHosts/resources outside of the normal polling/view lifecycle.
Changes:
- Introduces a new exported API (
createAspireExtensionApi) with AppHost/resource operations and test-run session acquire/release. - Adds one-shot CLI execution helpers (with new CLI error types) and
fetchAppHostsOnce()inAppHostDataRepository. - Extends the DCP server to support lease-based authorization and lifecycle management for test-run sessions, with accompanying unit/e2e tests.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| extension/src/views/AppHostDataRepository.ts | Adds one-shot CLI JSON runner, one-shot resource describe, and new CLI error types. |
| extension/src/api.ts | Introduces the public extension API surface for other extensions (and tests). |
| extension/src/extension.ts | Switches exported API to the new public API and includes RPC server info in exports. |
| extension/src/dcp/TestRunSessionManager.ts | Implements lease acquisition, expiration, and request authorization for test runs. |
| extension/src/dcp/AspireDcpServer.ts | Adds lease-based auth path, run cleanup for leases, and debug-session creation without an Aspire debug session. |
| extension/src/debugger/languages/dotnet.ts | Makes debug session optional and changes logging fallback behavior. |
| extension/src/dcp/types.ts | Makes debug session optional and allows async stopSession(). |
| extension/src/test/appHostDataRepository.test.ts | Adds tests for one-shot fetching and CLI failure/timeout behavior. |
| extension/src/test/api.test.ts | Adds tests for the new exported API behaviors (resource commands and event exposure). |
| extension/src/test/testRunSessionManager.test.ts | Adds unit tests for lease acquisition/release/expiry and authorization behavior. |
| extension/src/test/rpc/e2eServerTests.test.ts | Adds e2e test verifying exported test-run session API availability. |
| extension/src/test/aspireDcpServer.test.ts | Adds end-to-end-ish server tests for lease-backed run sessions and websocket behavior. |
- Updated activate() to merge the E2E test API (createExtensionApi) with the public API (createAspireExtensionApi) so apiVersion, dcpServerInfo, logDirectory, __testOnlyRpcServerInfo, and other E2E fields are exposed. - Updated e2eServerTests assertion to check sensitive fields aren't leaked rather than asserting dcpServerInfo doesn't exist (which contradicted extensionApi.test.ts). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… and remove deprecated API
…nConnections - Wrap stopSession() calls in try/finally so runsBySession and testRunSessionLeaseIdByRunId are always cleaned up even if stopSession() throws. - Remove the early closeLeaseNotificationConnections call in releaseTestRunSession, keeping only the one after stop/cleanup completes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
|
adamint
left a comment
There was a problem hiding this comment.
High-confidence finding from review:
stopRunSession now uses try/finally, which guarantees the maps are cleared, but the loop still awaits each debugSession.stopSession() sequentially without isolating failures. If one session rejects, the loop exits, the finally deletes the run bookkeeping, and the remaining debug sessions in the array are never stopped. Please catch/log per session, or collect failures after attempting all stops, so one bad debug session cannot orphan the rest.
Checklist
<remarks />and<code />elements on your triple slash comments?C#DK integration: https://devdiv.visualstudio.com/DevDiv/_git/vs-green/pullrequest/741646