Skip to content

Extension: implement VS Code telemetry signals (#17721)#17723

Merged
adamint merged 10 commits into
microsoft:mainfrom
adamint:feat/vscode-extension-telemetry-17721
Jun 1, 2026
Merged

Extension: implement VS Code telemetry signals (#17721)#17723
adamint merged 10 commits into
microsoft:mainfrom
adamint:feat/vscode-extension-telemetry-17721

Conversation

@adamint
Copy link
Copy Markdown
Member

@adamint adamint commented May 30, 2026

Description

Implements the VS Code extension telemetry signals from #17721. Adds five categories of events plus a foundation for outcome/duration tracking, with a documented data-collection notice in the extension README. CLI-side telemetry signals listed in #17721 are intentionally out of scope for this PR and tracked separately.

Fixes #17721

What's included

Signal Event(s)
S1 Meaningful engagement engagement/active — once per session on first AppHost-detected / command / debug session trigger; common property apphost_languages is published once known
S2 Running AppHosts view shown runningAppHostsView/shown — 1s-debounced hidden→visible transitions; view_mode property + running_apphosts / total_resources measurements
S3 Command palette commands command/invoked — both tryExecuteCommand and bypass paths (~30 commands across editor / tree / code lens / walkthrough) routed through the new registerInstrumentedCommand helper with source, outcome (success / error / canceled), and duration_ms
S4 Debug / launch lifecycle debug/runSession/{start,end} (per child resource) and debug/appHost/{start,end} (per AppHost). End event aggregates total_child_sessions, distinct_resource_type_count + names, ended_with_error, and duration_ms so we can see multi-language debugging at a glance
S5 Dashboard telemetry passthrough Full /telemetry/* contract bridged to the extension's reporter, replacing the hardcoded /telemetry/enabled = false stub. Honors the global VS Code telemetry setting. Bearer-only auth middleware for routes the dashboard hits (it does not send the DCP instance-id header). Per-operation correlation Map with a 1h abandoned-op TTL.

Foundation work

  • Singleton telemetry reporter accessor + sendTelemetryErrorEvent helper
  • withCommandTelemetry wrapper records outcome and duration_ms; cancellation-aware so cancel errors don't show error toasts
  • Common-properties merging (apphost_languages, apphost_present) attached automatically to every event
  • summarizeAppHostLanguages(candidates) → csharp | typescript | polyglot | unknown | none classifier
  • classifyAppHostPath(path) → csharp | typescript | unknown for the debug-session path
  • Test seams (__setReporterForTests, __resetCommonPropertiesForTests) for unit tests

Privacy

The README's new "Data and telemetry" section enumerates every event category and explicitly documents that source code, file paths, env values, and resource names are not reported. All events are gated on VS Code's global telemetry.telemetryLevel setting via @vscode/extension-telemetry.

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes — 13 new unit tests for the telemetry + appHostLanguage helpers; full extension suite (yarn run test) passes with 635 tests
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No — TypeScript extension only; no .NET public API surface affected.
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No — happy to walk through if a reviewer wants. Notes below.
    • No

Security notes

  • The new requireBearerOnly middleware on /telemetry/* routes still requires the DCP bearer token; it only relaxes the DCP-instance-id header requirement because the dashboard's HttpClient does not send it. The token is the same shared secret already used to gate every other DCP route.
  • Dashboard telemetry payloads are forwarded through @vscode/extension-telemetry, which respects the global VS Code telemetry.telemetryLevel setting. The extension never logs or persists payloads outside the existing extension log output channel.
  • Property flattening strips nested objects via JSON.stringify, but no per-event allow-list is enforced. Future work could add one if we observe the dashboard sending unexpectedly large or sensitive properties; current dashboard surface area is limited to its own structured AspireTelemetryProperty shape.

…board signals

Implements the VS-Code-side asks from microsoft#17721 (CLI signals tracked separately):

Foundation
- Singleton telemetry reporter accessor + sendTelemetryErrorEvent helper
- withCommandTelemetry wrapper records outcome (success/error/canceled) and duration_ms
- Common-properties helpers publish apphost_languages + apphost_present once known
- Test seams (__setReporterForTests / __resetCommonPropertiesForTests) for unit tests

Command coverage (S3)
- tryExecuteCommand routed through withCommandTelemetry
- registerInstrumentedCommand wraps the bypass paths (runAppHost / debugAppHost editor
  and tree commands, code lens commands, walkthrough commands, restore, etc.)
- Each event carries a source dimension (command_palette / editor / tree / codelens
  / walkthrough) for downstream cohort analysis

Meaningful engagement (S1)
- MeaningfulEngagementReporter fires engagement/active at most once per session on
  the first of: AppHost present in workspace, any command run, any debug session
- Publishes apphost_languages summary via summarizeAppHostLanguages classifier

Running AppHosts view (S2)
- AppHostsViewTelemetry subscribes to tree-view visibility with a 1s debounce
- Emits runningAppHostsView/shown with view_mode + running_apphosts / total_resources

Debug lifecycle (S4)
- debug/runSession/start on PUT /run_session in AspireDcpServer
- debug/runSession/end from sessionTerminated (and the launch-failure catch block)
  with exit_code_bucket + duration_ms
- debug/appHost/start emitted from AspireDebugSession on the launch DAP request
- debug/appHost/end emitted from AspireDebugSession.dispose() with aggregated stats
  (total_child_sessions, distinct_resource_type_count + names, ended_with_error,
  duration_ms). DCP server tracks aggregates via takeDebugSessionAggregateStats.

Dashboard passthrough (S5)
- Replaces the hardcoded GET /telemetry/enabled stub with the full dashboard
  telemetry contract (start/startOperation/endOperation/startUserTask/endUserTask/
  operation/userTask/fault/asset/property/recurringProperty/commandLineFlags)
- Property flattening of AspireTelemetryProperty into TelemetryReporter's flat
  key/value model, documented encoding rules
- Bearer-only auth middleware for routes the dashboard reaches (it does not send
  the DCP instance-id header)
- Per-operation correlation Map with a 1h TTL for abandoned operations
- /telemetry/enabled now honors the global VS Code telemetry setting

README data-collection notice documents every category of event the extension now
emits and confirms we do not report source, paths, env values, or resource names.

Tests:
- 13 new unit tests for telemetry + appHostLanguage helpers
- Full extension test suite passes (635 passing)

Refs microsoft#17721

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17723

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17723"

adamint and others added 2 commits May 30, 2026 02:07
Fixes from a parallel pass by Opus 4.7, Opus 4.8, and GPT 5.5 reviewers:

- AspireDcpServer: extract emitRunSessionFailureEnd helper so the three
  early-return failure paths in PUT /run_session (unsupported launch
  config, missing debug session, debugger did not start) emit
  debug/runSession/end and update the AppHost aggregate. Previously a
  debug/runSession/start without a matching /end was emitted and
  anyNonZeroExit was not flipped, so debug/appHost/end under-reported
  failures.
- AspireDcpServer: route nonzero debug/runSession/end through
  sendTelemetryErrorEvent (matches the catch-block path) so faults get
  the reporter's stricter scrubbing.
- DashboardTelemetryPassthrough: dispatch flattenProperties on
  PropertyType.Metric (the C# dashboard sends metric values as
  invariant-culture strings) instead of typeof === 'number'; drop
  PropertyType.Pii properties defensively; stringify other numbers
  instead of promoting to measurements.
- AspireDebugSession.dispose: snapshot start-event metadata, run
  disposables before emitting telemetry, and defer debug/appHost/end
  via setTimeout(500ms) so child sessionTerminated notifications have
  a chance to flow through the adapter tracker before the aggregate
  is taken.
- README: scope the absolute privacy claim to events the extension
  originates, and call out the Pii-drop behavior on dashboard
  passthrough.
- Add dashboardTelemetryPassthrough.test.ts (8 tests) covering
  Metric-string parsing, Pii drop, numeric non-promotion, and complex
  object stringification.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nsion

Wire-format / correctness:
- Dashboard sends JSON via HttpClient.PostAsJsonAsync which since .NET 9
  uses JsonSerializerOptions.Web (camelCase property naming, numeric enum
  serialization). The TS interfaces in DashboardTelemetryPassthrough were
  PascalCase, so every dashboard request bound to undefined fields:
  - flattenProperties dropped all dashboard properties (Value/PropertyType
    vs value/propertyType).
  - PostOperationRequest.Result was undefined, so failure routing to
    sendTelemetryErrorEvent never fired.
  - All other route handlers (operation, userTask, fault, asset, property,
    recurringProperty, commandLineFlags) read undefined fields.
  Rewrite the interface definitions to camelCase and add numeric-enum
  label mappers (telemetryResultLabel, faultSeverityLabel, isFailureResult).
- Route Failure/UserFault results from PostOperation/PostUserTask/endOperation
  through sendTelemetryErrorEvent so they participate in the reporter's
  stricter scrubbing pass.

Privacy:
- Add scrubFreeformDiagnosticText to truncate forwarded dashboard exception
  messages and fault descriptions to 1024 chars (defense-in-depth on top of
  the reporter's pattern-based PII scrubbing).
- Clamp resource_type / distinct_resource_types telemetry to the supported
  resource-type set; unsupported launch-configuration types are now reported
  as 'unsupported' instead of leaking the raw CLI-supplied type string.
- Validate the launch.json command against the AspireCommandType allowlist
  ('run' | 'deploy' | 'publish' | 'do') before emitting it; unknown values
  collapse to 'other'.
- Tighten README to be explicit that forwarded dashboard exception messages
  may contain user-controlled fragments and document the new resource_type
  clamping.

Performance:
- Fix _handleStart abandonment-timer closure leak: project eventName out
  of payload before capturing it in the 1h setTimeout closure so the
  arbitrarily large payload.settings.startEventProperties bag is not
  retained for the entire TTL.

AppHost classification:
- Add classifyAppHostDirectory which synchronously enumerates marker files
  (apphost.{ts,mts,cts,js,mjs,cjs}, AppHost.cs, *.csproj). Previously,
  every directory-style AppHost was classified as 'unknown', which would
  miscategorize the entire TypeScript AppHost cohort once it landed.
- AspireDebugSession now uses classifyAppHostDirectory when appHostPath
  refers to a directory.

Security:
- Fix loose bearer-prefix check in requireHeaders/requireBearerOnly.
  auth.split('Bearer ').length === 2 accepted other schemes that happened
  to contain 'Bearer ' as a substring (e.g. 'X-Bearer <token>'). Replace
  with startsWith('Bearer ') + slice and factor the two middlewares onto
  a shared validateBearerToken helper.

Tests:
- Update dashboardTelemetryPassthrough.test.ts fixtures from PascalCase to
  camelCase and add coverage for enum label mappers and scrub helper.
- Add classifyAppHostDirectory tests covering csharp / typescript /
  unknown / mixed cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

@adamint adamint force-pushed the feat/vscode-extension-telemetry-17721 branch from e4745da to 6e2a01a Compare May 30, 2026 07:52
@adamint adamint force-pushed the feat/vscode-extension-telemetry-17721 branch from 6e2a01a to 22af314 Compare May 30, 2026 08:04
The Aspire dashboard sends arbitrary event names and arbitrary property
keys over HTTP. Previously the passthrough forwarded those names
verbatim, which meant every new dashboard event or property silently
added rows to the (EntityName, PropertyName) classification catalog and
required someone to chase down the new entries to classify them.

This change:

- Introduces src/utils/telemetryRegistry.ts: a per-event schema mapping
  every event name to its allowed properties + measurements. The schema
  mirrors the catalog's pair-based model so every "row" the extension
  is allowed to produce is enumerable in one file.
- Makes sendTelemetryEvent / sendTelemetryErrorEvent generic over the
  registered event names, so TypeScript rejects calls with unregistered
  event names or unregistered property/measurement keys at compile time.
- Restricts setCommonTelemetryProperties to the registered common-property
  keys, since every common property duplicates into a row per event.
- Replaces the dashboard passthrough's per-route emit-with-payload-name
  pattern with a fixed set of extension event names (dashboard/operation,
  dashboard/userTask, dashboard/fault, dashboard/asset,
  dashboard/scope/start, dashboard/scope/end, dashboard/property/set,
  dashboard/property/recurring, dashboard/commandLineFlags). The
  dashboard's original event name now lives in 'dashboard_event_name'.
- Replaces flattenProperties with bundleDashboardData, which JSON-encodes
  the per-event property/metric bag into stable 'dashboard_properties' /
  'dashboard_measurements' string fields, with size guardrails
  (100 entries / 8 KB per bundle, with a __truncated__ marker on overflow).
- Drops the getTelemetryReporter export to remove a bypass path that
  let callers skip the typed wrapper layer.
- Updates the README to describe the new bounded surface and points
  reviewers at the registry.

Tests: 664 passing (was 658 — 6 new bundleDashboardData / formatCorrelations
tests cover Pii drops, truncation by count, truncation by size,
independent property/measurement budgets, and correlation formatting).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
adamint and others added 2 commits May 30, 2026 14:28
Adversarial multi-model review of the VS Code dashboard telemetry
passthrough surfaced several ways user/workspace content or attacker
input could reach the telemetry channel despite the README's "no
resource names or workspace contents" guarantee.

Privacy hardening — drop free-form, dashboard-composed strings instead
of merely truncating them (truncation bounds volume, not sensitivity):
- bundleDashboardData now drops the Basic-tagged exception message and
  stack trace keys (Aspire.Dashboard.Exception.Message/StackTrace) which
  embed resource names and home-directory paths. Structured
  Exception.Type/RuntimeVersion are retained.
- /telemetry/fault no longer forwards `description` (the dashboard builds
  it as `${type}: ${ex.Message}`).
- /telemetry/endOperation no longer forwards `error_message` (the only
  producer passes a caught `ex.Message`, see DashboardCommandExecutor).
- /telemetry/operation and /telemetry/userTask no longer forward
  `result_summary` (latent free-form field at the network boundary).
- Removed the now-unused description/error_message/result_summary members
  from the telemetry registry so they can't be reintroduced accidentally.

Additional boundary hardening:
- formatFlagPrefixes strips flag values (keeps only the name before the
  first `=`, `:`, or whitespace) so `--token=secret` can't leak the value.
- formatCorrelations clamps each eventType/id (previously only count-capped).
- scrubFreeformDiagnosticText accepts `unknown` and coerces non-strings to
  '' so a malformed JSON body can't throw a TypeError -> Express 500.
- Telemetry `mode` is clamped to Debug/NoDebug/Unknown/other instead of
  forwarding the CLI-controlled value verbatim.

Updated the README data/telemetry section and added unit + route tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Final multi-model privacy/security review of the telemetry passthrough
surfaced several JSON-boundary hardening gaps. Fixes:

- Property routes now bundle the value under the real property name so the
  free-form drop-list and key clamping apply (they were bypassed by the
  synthetic 'value' wrapper key).
- asset_event_version, result/severity labels, and the Pii discriminator
  now coerce numeric input instead of interpolating/forwarding arbitrary
  strings or arrays verbatim.
- Truncation markers are reserved inside the cap so clamped values never
  exceed their documented maximum.
- formatCorrelations now bounds total serialized size, not just per-element
  and count caps.
- Dropped the console-logs resource-name key as defense-in-depth.
- Removed a dead isExtensionTelemetryEnabled import.

Added unit tests for each behavior; full extension suite passes (708).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@adamint adamint marked this pull request as ready for review May 30, 2026 18:46
Copilot AI review requested due to automatic review settings May 30, 2026 18:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the next round of telemetry instrumentation in the Aspire VS Code extension, including meaningful engagement, view visibility, command outcome/duration tracking, debug lifecycle signals, and a dashboard telemetry passthrough bridge (with an explicit privacy notice in the extension README).

Changes:

  • Adds a typed telemetry event/property registry and expands the telemetry utilities to support common properties plus command outcome/duration tracking.
  • Instruments extension commands, view visibility, meaningful engagement triggers, and debug session lifecycle events.
  • Implements the /telemetry/* HTTP contract for dashboard telemetry passthrough with bundling/scrubbing guardrails and adds extension unit/route-level tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
extension/src/views/AppHostsViewTelemetry.ts Emits runningAppHostsView/shown on debounced hidden→visible transitions with counts/measurements.
extension/src/utils/telemetryRegistry.ts Central registry of allowed event names and per-event property/measurement keys for classification control.
extension/src/utils/telemetry.ts Adds common-property merging, enablement gating, error event helper, and withCommandTelemetry outcome/duration wrapper.
extension/src/utils/meaningfulEngagement.ts Adds once-per-session engagement/active emission gated on AppHost detection/command/debug-session triggers.
extension/src/utils/appHostLanguage.ts Adds AppHost language-family classifiers for telemetry (workspace and debug-path variants).
extension/src/test/telemetry.test.ts Unit tests for telemetry utilities (common props merge, command outcome/duration, cancellation detection).
extension/src/test/dashboardTelemetryRoutes.test.ts Route-level integration tests for dashboard→extension passthrough normalization and privacy guarantees.
extension/src/test/dashboardTelemetryPassthrough.test.ts Unit tests for bundling/scrubbing/clamping and defensive parsing in passthrough helpers.
extension/src/test/appHostLanguage.test.ts Unit tests for AppHost language summary and path/directory classification.
extension/src/extension.ts Wires engagement reporter, adds command instrumentation wrapper, instruments many commands, and routes tryExecuteCommand through telemetry.
extension/src/debugger/AspireDebugSession.ts Emits debug/appHost/start and delayed debug/appHost/end summary using DCP aggregate stats.
extension/src/dcp/DashboardTelemetryPassthrough.ts Implements the full /telemetry/* dashboard contract, normalization, and privacy/scrubbing guardrails.
extension/src/dcp/AspireDcpServer.ts Adds dashboard telemetry routes, tightens bearer parsing, emits debug/runSession/*, and hardens IDs with crypto.randomBytes.
extension/README.md Documents telemetry collection categories and privacy posture/guarantees.

Comment thread extension/src/extension.ts
Comment thread extension/src/dcp/AspireDcpServer.ts Outdated
- tryExecuteCommand: when the CLI is unavailable and the user is redirected
  to install it, throw a cancellation so withCommandTelemetry records the
  invocation as 'canceled' instead of a false 'success'. The existing catch
  suppresses the error toast since the redirect already informed the user.
- requireBearerOnly (/telemetry/*): the 'missing' bearer failure now returns
  an Authorization-only message. Those routes intentionally do not require the
  DCP instance-id header, so the combined 'Authorization and ...DCP-Instance-ID
  headers are required' message was misleading. This branch is only reachable
  from requireBearerOnly (requireHeaders handles missing headers inline).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread extension/README.md Outdated
@adamint adamint requested a review from davidfowl May 31, 2026 07:28
adamint and others added 2 commits May 31, 2026 04:13
Address final review findings in the VS Code extension telemetry implementation by tightening dashboard passthrough privacy, binding DCP to localhost, preserving AppHost failure telemetry, and classifying user cancellations correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Require dashboard telemetry requests to carry a scoped DCP instance id, propagate that id from AppHost configuration into the dashboard sender, and register/prefix the extension telemetry catalog for VS Code visibility.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@adamint adamint merged commit 54ff1b9 into microsoft:main Jun 1, 2026
312 checks passed
@microsoft-github-policy-service microsoft-github-policy-service Bot added this to the 13.5 milestone Jun 1, 2026
pull Bot pushed a commit to tooniez/aspire that referenced this pull request Jun 2, 2026
Brings 43 release-branch commits forward onto main now that 13.4.0 has shipped.
This PR replaces the original automated merge (microsoft#17804) which had to be closed so
that conflict resolution and post-merge cleanups could be made on a non-protected
branch.

Conflict resolution summary (33 files):

* Equivalent backports (took main's commit identity): ChannelUpdateWorkflowTests,
  LoggingHelpersTests, the four extension test files, AspireEditorCommandProvider,
  appHostDiscovery.

* Release-only forwards (preserved): microsoft#17732 / microsoft#17756 Foundry hosted-agent protocol
  selection and cross-compute-environment endpoint references, microsoft#17573 stabilize
  PrebuiltAppHostServer staging globalPackagesFolder path, microsoft#17743 staging-identity
  CLI darc feed routing.

* Main-only forwards (preserved): microsoft#17506 Show discovered AppHosts in Aspire pane,
  microsoft#17547 Localize Aspire skills metadata errors, microsoft#17801 VS Code v1.12.0, microsoft#17297
  Aspire CLI npm package release integration, microsoft#17576 TerminalRun IAsyncDisposable,
  microsoft#17721 / microsoft#17723 VS Code telemetry, microsoft#17671 ATS baseline fix (re-applied manually
  on top of Foundry source taken from release).

* Hybrid (manually spliced): docs/contributing.md - kept main's restructured
  layout and inserted release's staging-validation paragraph; HostedAgentBuilder-
  Extension - took release base then re-applied microsoft#17671 asHostedAgent rename;
  UpdateCommandTests - took main and injected microsoft#17743's
  OverrideCliInformationalVersionConfigKey block.

Post-merge cleanups included in this PR:

* eng/Versions.props: revert StabilizePackageVersion to false (was flipped to
  true on release/13.4 by microsoft#17520 for shipping 13.4.0; main must stay in preview
  mode).

* .github/workflows/generate-api-diffs.yml: retarget back to main (was pointed
  at release/13.4 by microsoft#17696 release prep).

* .github/workflows/backmerge-release.yml: update from release/13.3 to
  release/13.4 (was stale - missed the 13.4 release-time bump).

* .github/workflows/milestone-assignment.yml: audited - already correct
  (main -> 13.5, release/13.4 -> 13.4.x); no change.

This merge commit must be preserved - do not squash on merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

VS Code extension: next round of telemetry signals (engagement, view, commands, debug, dashboard passthrough)

4 participants