feat: comprehensive telemetry audit - add command-specific usage attributes#7299
feat: comprehensive telemetry audit - add command-specific usage attributes#7299
Conversation
There was a problem hiding this comment.
Pull request overview
Adds command-specific telemetry attributes and supporting documentation/tests as part of a broader telemetry audit, plus a small auth identity classification tweak and a test flake fix.
Changes:
- Introduces new telemetry attribute keys (auth/config/env/hooks/templates/pipeline/monitor/show/infra) and emits them from several commands.
- Adds audit/spec documentation for the telemetry schema, privacy review triggers, and an audit process/matrix.
- Updates tests (new field contract tests, “coverage” test) and makes
StateCacheManager_TTLdeterministic.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/specs/metrics-audit/telemetry-schema.md | New telemetry schema reference and field catalog |
| docs/specs/metrics-audit/privacy-review-checklist.md | New checklist for privacy review triggers + hashing guidance |
| docs/specs/metrics-audit/feature-telemetry-matrix.md | New command × telemetry coverage matrix |
| docs/specs/metrics-audit/audit-process.md | New recurring telemetry audit process doc + automation suggestions |
| cli/azd/pkg/state/state_cache_test.go | Makes TTL test deterministic by backdating instead of sleeping |
| cli/azd/pkg/auth/manager.go | Adjusts account type detection to if/else-if/else and adds Anonymous fallback |
| cli/azd/internal/tracing/fields/fields.go | Adds new telemetry field keys + AccountTypeAnonymous |
| cli/azd/internal/tracing/fields/fields_audit_test.go | Adds tests validating new field metadata and key behavior |
| cli/azd/internal/cmd/show/show.go | Emits show.output.format usage attribute |
| cli/azd/cmd/templates.go | Emits template.operation usage attribute across template subcommands |
| cli/azd/cmd/telemetry_coverage_test.go | Adds a test intended to act as a telemetry “coverage” gate |
| cli/azd/cmd/pipeline.go | Emits pipeline.provider and pipeline.auth usage attributes |
| cli/azd/cmd/monitor.go | Emits monitor.type usage attribute |
| cli/azd/cmd/infra_generate.go | Emits infra.provider usage attribute when present in project config |
| cli/azd/cmd/hooks.go | Emits hooks.name and hooks.type usage attributes |
| cli/azd/cmd/env_remove.go | Emits env.operation=remove usage attribute |
| cli/azd/cmd/env.go | Emits env.operation across env subcommands + env.count on list |
| cli/azd/cmd/config.go | Emits config.operation across config subcommands |
| cli/azd/cmd/auth_status.go | Emits auth.method=check-status usage attribute |
| cli/azd/cmd/auth_logout.go | Emits auth.method=logout usage attribute |
| cli/azd/cmd/auth_login.go | Emits auth.method for multiple login modes + hashes explicit tenant ID |
kristenwomack
left a comment
There was a problem hiding this comment.
This is fantastic! Left some comments/feedback.
weikanglim
left a comment
There was a problem hiding this comment.
First round of quick review and feedback -- most of the added keys are duplicative of existing data, would like to discuss further.
There was a problem hiding this comment.
quick pass number 2 . I would need more time to review docs changes after.[
@jongio Can we also update the PR to reflect the latest changes
AngelosP
left a comment
There was a problem hiding this comment.
-
Sounds good to me,
-
I'll check with another PM who knows about this stuff a bit more but I think we should be good. Assume we are good, you will hear from me if we are not good.
-
Thank you so much that's awesome! I will play around with this ASAP
…ibutes - Add telemetry to auth, config, env, hooks, templates, pipeline, monitor, show, infra commands - Add 16 new telemetry field constants for command-specific attributes - Fix user identity tracking with Anonymous account type fallback - Fix flaky TestStateCacheManager_TTL timing issue - Add audit documentation: feature matrix, schema, privacy checklist, audit process - Add telemetry field contract tests and CI coverage check Resolves #1772 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Rename test: TestTelemetryFieldConstants with clarified allowlist approach - pipeline.go: skip SetUsageAttributes for empty provider/auth values - docs: fix internal/telemetry/ -> cli/azd/internal/tracing/ paths - docs: add Anonymous to ad.account.type allowed values - docs: add missing legend symbol in feature-telemetry-matrix.md - docs: pick CODEOWNERS over GitHub Actions for telemetry PR labeling - docs: add opt-out rate estimation section with @AngelosP question - cspell: add metrics-audit word list for docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extensions register custom hooks via WithProjectEventHandler/WithServiceEventHandler with arbitrary string names that are not validated against a fixed set. Hash the hook name to prevent potential PII leakage in telemetry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…shing - pipeline.provider: emit 'auto' when user doesn't specify --provider - pipeline.auth.type: emit 'auto' when user doesn't specify --auth-type - infra.provider: emit 'auto' when provider not set in project config - hooks.name: revert to raw string (not hashed) for telemetry readability - audit-process.md: add telemetry validation pipeline section - telemetry-schema.md: document 'auto' as valid value for provider fields Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove attributes that duplicate data already captured by command span names (config.operation, env.operation, template.operation), OTel span status (auth.result), or cmd.flags flag names (monitor.type). Remove show.output.format (should be global, tracked as follow-up). Remove dead code Anonymous fallback in manager.go. 8 unique attributes remain: auth.method, auth.tenant.id.hashed, env.count, hooks.name, hooks.type, infra.provider, pipeline.provider, pipeline.auth.type. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use CiProviderName() to log the actual resolved provider name after auto-detection instead of the 'auto' placeholder. For auth type, only log when explicitly specified — cmd.flags absence indicates auto-detection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove 'logout' as auth.method value (not an auth method) - Unhash tenant ID (infrastructure GUID, not PII) - Revert unrelated cosmetic change in auth_status.go - Revert unrelated if/else logic change in manager.go - Remove redundant TestFieldKeyValues test - Rename tenant key from ad.tenant.id to auth.tenant.id Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
spboyer
left a comment
There was a problem hiding this comment.
Code Review — Telemetry Audit
4 findings (deduplicated across two independent reviews):
1. TenantIdKey renamed from ad.tenant.id to auth.tenant.id — breaking schema change
File: cli/azd/internal/tracing/fields/fields.go:141
The constant changes from ad.tenant.id to auth.tenant.id, but the schema doc still lists ad.tenant.id, and the sibling keys (ad.account.type, ad.subscription.id) keep the ad. prefix. Any downstream Kusto queries or dashboards expecting ad.tenant.id will silently stop receiving data. Either keep the existing key name or treat this as an explicit migration with matching doc/query updates.
2. pipeline.auth not emitted on the default federated path
File: cli/azd/cmd/pipeline.go:176
Telemetry is gated by if p.flags.PipelineAuthTypeName != "", but both pipeline providers treat empty auth type as the federated default. The most common path records no pipeline.auth at all, making the field systematically incomplete. Emit the resolved auth type after defaulting, not just the raw flag value.
3. Coverage test does not actually enforce command coverage; infra generate already missing
File: cli/azd/cmd/telemetry_coverage_test.go
TestCommandTelemetryCoverage claims to fail when a command appears in neither list, but it never inspects the registered command tree. It only validates internal consistency (no overlap, sorted, no duplicates). Proof: this PR instruments infra generate with InfraProviderKey (and feature-telemetry-matrix.md correctly lists it), yet "infra generate" is absent from both commandsWithSpecificTelemetry and commandsWithOnlyGlobalTelemetry. The test passes anyway. At minimum, add the missing entry. Consider updating the doc comment to clarify this is a manual manifest, not an automated completeness check.
4. auth.method="check-status" emitted but undocumented
File: cli/azd/cmd/auth_login.go:309
The --check-status branch emits tracing.SetUsageAttributes(fields.AuthMethodKey.String("check-status")), but "check-status" is missing from the allowed-value list in telemetry-schema.md and from TestTelemetryFieldConstants. Either add it to the schema and test allowlist, or use a separate field if check-status represents a different operation category.
…rate, undocumented check-status - Update telemetry-schema.md: ad.tenant.id -> auth.tenant.id to match code - Add check-status to auth.method allowed values, remove stale logout - Add missing 'infra generate' to commandsWithSpecificTelemetry manifest Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thanks @spboyer for the thorough review! Addressing each finding: 1. ✅ TenantIdKey schema mismatch — FixedGood catch. The schema doc still referenced 2. ❌ pipeline.auth default path — By designThis was a deliberate design choice made during @weikanglim's review (round 3). We intentionally only log
If we decide we want the resolved auth type in the future, that's tracked as part of the broader enum-flags initiative (#7325). 3. ✅
|
Known built-in hook names (pre/post build, deploy, etc.) are logged raw. Unknown/extension-defined hook names are hashed via SHA-256 to avoid logging arbitrary user input as customer content. Addresses review feedback from @weikanglim on hook name telemetry. Tracks extension hook telemetry gap in issue #7326. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove fields_audit_test.go per Wei's feedback — test duplicated field constants without clear value; snapshot testing would be preferred. - Revert TenantIdKey from auth.tenant.id back to ad.tenant.id to avoid data contract change on existing context-level field. - Update telemetry-schema.md to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Azure Dev CLI Install InstructionsInstall scriptsMacOS/Linux
bash: pwsh: WindowsPowerShell install MSI install Standalone Binary
MSI
Documentationlearn.microsoft.com documentationtitle: Azure Developer CLI reference
|
weikanglim
left a comment
There was a problem hiding this comment.
LGTM -- Thanks for taking the time to address all the feedback
Summary
Comprehensive telemetry audit addressing the gaps identified in #1772. Adds command-specific telemetry attributes to 6 command groups, introduces 7 new field constants with classification metadata, and establishes CI-enforced coverage tracking.
Telemetry Attributes Added (8 fields across 6 command groups)
auth loginauth.methodauth loginauth.tenant.idenv listenv.counthooks runhooks.namehooks runhooks.typepipeline configpipeline.providerpipeline configpipeline.authinfra_generateinfra.providerField Definitions (+71 lines in
fields.go)AttributeKeyconstants withClassificationandPurposemetadataTenantIdKeyfrom hashed to raw (tenant IDs are infrastructure GUIDs, not PII)CI Quality Gates
TestCommandTelemetryCoverage— manifest of all CLI commands classified as either having command-specific telemetry or relying on global middleware. Forces developers to classify new commands.TestTelemetryFieldConstants— regression guard for existing field definitions (key names, types).TestNewFieldConstantsDefined— validates classification/purpose metadata on new fields.Documentation (4 new docs)
docs/specs/metrics-audit/feature-telemetry-matrix.md— command x telemetry coverage matrixdocs/specs/metrics-audit/telemetry-schema.md— schema contract for downstream consumersdocs/specs/metrics-audit/privacy-review-checklist.md— when/how to update privacy reviewdocs/specs/metrics-audit/audit-process.md— recurring audit processBug Fix
TestStateCacheManager_TTL(deterministic backdating instead oftime.Sleep)Remaining Work (blocked)
coreai-microsoft/azure-dev-tools/product-telemetry/azdand LENS platformTesting
go build,go vet,gofmtall cleanResolves #1772