feat: add telemetry instrumentation for Copilot agent flows #7199
wbreza merged 2 commits into Azure:main
Closing to re-open with correct base. This PR depends on #7172 and should be reviewed/merged after it.
Pull request overview
This PR adds OpenTelemetry instrumentation and new usage attributes around Copilot agent lifecycle and per-message processing, and extends the gRPC surface area to expose Copilot session/message/metrics/file-change APIs to extension consumers.
Changes:
- Add Copilot tracing fields + spans/usage attributes for initialization, session lifecycle, per-message usage, and consent counts.
- Add gRPC CopilotService protobuf + generated clients/servers and wire it into the azd gRPC server/client.
- Add headless-mode support (collector + permission auto-approval), plus file-change tracking surfaced via agent metrics and gRPC.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| cli/azd/pkg/watch/watch.go | Adds GetFileChanges() + typed FileChange/FileChanges formatting used by agent metrics and gRPC responses. |
| cli/azd/pkg/watch/watch_test.go | Adds tests for file change tracking via watcher. |
| cli/azd/pkg/azdext/copilot.pb.go | Generated protobuf types for CopilotService messages (sessions, message, usage, file changes). |
| cli/azd/pkg/azdext/copilot_grpc.pb.go | Generated gRPC stubs for CopilotService. |
| cli/azd/pkg/azdext/azd_client.go | Adds AzdClient.Copilot() accessor for the new gRPC service. |
| cli/azd/internal/tracing/fields/fields.go | Adds new AttributeKeys for copilot session/init/message/mode/consent usage fields. |
| cli/azd/internal/grpcserver/server.go | Registers CopilotService with the gRPC server. |
| cli/azd/internal/grpcserver/server_test.go | Updates server tests to include CopilotService server wiring. |
| cli/azd/internal/grpcserver/prompt_service_test.go | Updates prompt service test server setup to include CopilotService. |
| cli/azd/internal/grpcserver/copilot_service.go | Implements CopilotService gRPC server routing to agents, plus usage/file-change conversion. |
| cli/azd/internal/grpcserver/copilot_service_test.go | Adds unit tests for CopilotService session/message/metrics/file-change/stop behaviors. |
| cli/azd/internal/agent/types.go | Adds Agent/AgentFactory interfaces, AgentMetrics, usage formatting via String(), and per-turn file changes. |
| cli/azd/internal/agent/types_test.go | Updates tests for renamed usage formatting (String() instead of Format()). |
| cli/azd/internal/agent/headless_collector.go | Adds headless collector for SDK events and usage metrics aggregation. |
| cli/azd/internal/agent/headless_collector_test.go | Adds unit tests for headless collector usage accumulation and idle signaling. |
| cli/azd/internal/agent/copilot_agent_factory.go | Returns AgentFactory interface and Create() now returns Agent. |
| cli/azd/internal/agent/copilot_agent.go | Adds spans/attributes, headless send path, cumulative metrics + file changes, consent counters, and session-cancellation detachment. |
| cli/azd/internal/agent/copilot/cli.go | Improves plugin detection with a CLI-output fallback to scanning plugin directories. |
| cli/azd/internal/agent/consent/workflow_consent.go | Records consent-scope selection as usage attributes. |
| cli/azd/grpc/proto/copilot.proto | Adds CopilotService protobuf definition (Initialize, ListSessions, SendMessage, metrics, file changes, StopSession). |
| cli/azd/extensions/microsoft.azd.demo/internal/cmd/root.go | Registers new demo copilot command. |
| cli/azd/extensions/microsoft.azd.demo/internal/cmd/copilot.go | Adds demo extension command that exercises CopilotService gRPC API in a chat loop. |
| cli/azd/extensions/microsoft.azd.demo/extension.yaml | Documents the new demo copilot command usage. |
| cli/azd/extensions/azure.ai.agents/version.txt | Bumps extension version to 0.1.16-preview. |
| cli/azd/extensions/azure.ai.agents/extension.yaml | Syncs extension.yaml version to 0.1.16-preview. |
| cli/azd/extensions/azure.ai.agents/CHANGELOG.md | Adds 0.1.16-preview changelog entry. |
| cli/azd/cmd/middleware/error.go | Uses UsageMetrics.String() when displaying agent usage in error middleware; updates factory type to interface. |
| cli/azd/cmd/init.go | Renames InitMethod to copilot, adds environment init method, and records aggregate copilot metrics after session completion. |
| cli/azd/cmd/container.go | Wires CopilotService into IoC container for gRPC server. |
75a1163 to d86b808
weikanglim left a comment
Reviewed telemetry changes -- Overall looks good. Left a few comments on a few changes we'd want to make for overall completeness.
Add OpenTelemetry spans and usage attributes to track Copilot agent session lifecycle, initialization prompts, message usage, and consent decisions. All instrumentation is in the core agent packages so it works for both direct CLI and gRPC extension framework consumers.
Changes:
- Define 16 new AttributeKey fields in internal/tracing/fields/fields.go covering session (id, isNew, messageCount), init prompts (isFirstRun, reasoningEffort, model, consentScope), per-message metrics (model, tokens, billingRate, premiumRequests, durationMs), and consent counts (approvedCount, deniedCount)
- Add copilot.initialize span in CopilotAgent.Initialize() tracking reasoning level, model selection, and isFirstRun
- Add copilot.message span in CopilotAgent.SendMessage() tracking per-message usage metrics and cumulative message count
- Add copilot.session span in ensureSession() tracking session creation vs resumption with hashed session ID
- Track consent approved/denied counts in permission handler and record as usage attributes on agent Stop()
- Track workflow consent scope selection in PromptWorkflowConsent()
- Rename InitMethod from 'agent' to 'copilot' in cmd/init.go
- Add InitMethod='environment' for previously untracked init branch
- Record aggregate copilot metrics as usage attributes in initAppWithAgent
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d86b808 to 898515d
- Register event names in tracing/events/events.go (CopilotInitializeEvent, CopilotSessionEvent) per convention
- Remove per-message copilot.message span (too chatty); emit only session-level metrics
- Consolidate all cumulative metrics into Stop() (mode, messageCount, tokens, billing, consent counts, session ID) so telemetry is captured from the core agent package regardless of caller
- Remove duplicate aggregate metrics from cmd/init.go, now handled by agent.Stop()
- Use defer + named returns in Initialize() and ensureSession() to capture errors via span.EndWithStatus(err) and reduce duplication
- Log resumeSessionID in ensureSession defer even on early failure
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
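The "defer + named returns" item above can be illustrated with a minimal, self-contained sketch. It assumes nothing about the real azd tracing package: fakeSpan and initialize are hypothetical stand-ins, and only the EndWithStatus method name is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeSpan is a hypothetical stand-in for a tracing span. It only records
// how it was ended so the pattern can be shown without a tracing backend.
type fakeSpan struct {
	name  string
	ended bool
	err   error
}

// EndWithStatus closes the span and stores the final error (nil on success).
func (s *fakeSpan) EndWithStatus(err error) {
	s.ended = true
	s.err = err
}

// initialize is a hypothetical function shaped like the description above.
// The named return value err is read inside the deferred closure, so every
// return path, including early failures, reports its status to the span.
func initialize(span *fakeSpan, fail bool) (err error) {
	defer func() { span.EndWithStatus(err) }()

	if fail {
		return errors.New("model selection failed") // early failure path
	}
	return nil
}

func main() {
	ok := &fakeSpan{name: "copilot.initialize"}
	_ = initialize(ok, false)
	fmt.Println(ok.name, ok.ended, ok.err)

	bad := &fakeSpan{name: "copilot.initialize"}
	_ = initialize(bad, true)
	fmt.Println(bad.name, bad.ended, bad.err)
}
```

The closure matters here: writing defer span.EndWithStatus(err) directly would evaluate err when the defer statement runs, so the span would always see nil.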
weikanglim left a comment
Looks good -- we can update the session ID to be non-hashed, but otherwise this is good-to-go from my eyes.
Summary
Adds OpenTelemetry telemetry instrumentation to the Copilot agent flow. All instrumentation lives in the core agent packages (internal/agent/) so telemetry works for both direct CLI integration and gRPC extension framework consumers.
Telemetry Fields Added (16 new AttributeKeys)
Session (copilot.session.*)
- copilot.session.id
- copilot.session.isNew
- copilot.session.messageCount
Initialization (copilot.init.*)
- copilot.init.isFirstRun
- copilot.init.reasoningEffort
- copilot.init.model
- copilot.init.consentScope
Cumulative Usage (copilot.message.*)
- copilot.message.model
- copilot.message.inputTokens
- copilot.message.outputTokens
- copilot.message.billingRate
- copilot.message.premiumRequests
- copilot.message.durationMs
- copilot.mode
Consent (copilot.consent.*)
- copilot.consent.approvedCount
- copilot.consent.deniedCount
Spans
- copilot.initialize in CopilotAgent.Initialize() (EndWithStatus)
- copilot.session in CopilotAgent.ensureSession()
Both span names are registered as constants in tracing/events/events.go.
No per-message span is emitted; all cumulative metrics are set as usage attributes in Stop() to avoid chatty telemetry.
Design
- Initialize() sets init config attributes (model, reasoning, isFirstRun) via defer + named returns
- SendMessage() increments messageCount on success only (no span)
- Stop() consolidates all cumulative session metrics as usage attributes (tokens, billing, consent counts, session ID, mode, message count)
- PromptWorkflowConsent() tracks the consent scope selection
Init Telemetry Fixes
- Rename InitMethod from "agent" to "copilot"
- Add InitMethod="environment" for the environment-only init branch
Testing
- go build ./... - pass
- go test ./internal/agent/... -short - pass
- go test ./internal/tracing/... -short - pass
- golangci-lint run - 0 issues