Improve telemetry error classification with typed sentinels #7051
Pull request overview
This PR improves azd telemetry error classification by unwrapping ErrorWithSuggestion before mapping and by introducing/using typed sentinel errors so command failures no longer collapse into opaque buckets like internal.errors_errorString.
Changes:
- Update MapError to classify the inner error when wrapped in ErrorWithSuggestion.
- Add a set of typed sentinel errors in internal/errors.go and use them across command Run() paths (wrapping with %w + adding user suggestions).
- Add/extend tests to enforce error mapping coverage and prevent new “bare” errors in Run() methods.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| cli/azd/internal/errors.go | Adds new internal sentinel errors used for telemetry result-code mapping. |
| cli/azd/internal/cmd/errors.go | Updates MapError to unwrap ErrorWithSuggestion first and maps new sentinels to stable result codes. |
| cli/azd/internal/cmd/errors_test.go | Adds many mapping test cases and AST-based enforcement to prevent new bare errors. |
| cli/azd/internal/cmd/show/show.go | Returns ErrorWithSuggestion for missing explicitly named environments. |
| cli/azd/internal/cmd/deploy.go | Replaces bare errors with sentinels + suggestions for deploy validation paths. |
| cli/azd/internal/cmd/publish.go | Replaces bare errors with sentinels + suggestions for publish validation paths. |
| cli/azd/internal/cmd/provision.go | Replaces bare errors with sentinels + suggestions for provision validation paths. |
| cli/azd/cmd/update.go | Uses sentinel + suggestion for unsupported update on non-prod builds. |
| cli/azd/cmd/up.go | Uses sentinel + suggestion for subscription/location change validation. |
| cli/azd/cmd/templates.go | Uses sentinel + suggestion for template source validation failures. |
| cli/azd/cmd/monitor.go | Uses sentinel + suggestion for missing infra/resource preconditions. |
| cli/azd/cmd/mcp.go | Uses sentinel + suggestion for MCP tool load and flag validation errors. |
| cli/azd/cmd/init.go | Uses sentinel + suggestion for init mode/flag validation errors. |
| cli/azd/cmd/hooks.go | Uses sentinel + suggestion for invalid service name. |
| cli/azd/cmd/extensions.go | Uses sentinel + suggestion for missing extension annotation / token generation. |
| cli/azd/cmd/extension.go | Uses sentinels + suggestions throughout extension commands; replaces bare validation errors. |
| cli/azd/cmd/env_remove.go | Uses sentinel + suggestion for missing env arg / env not found. |
| cli/azd/cmd/env.go | Uses sentinels + suggestions for env arg/flag validation and config lookup failures. |
| cli/azd/cmd/env_config_test.go | Updates expected error substring to match new message. |
| cli/azd/cmd/config.go | Uses sentinel + suggestion for missing config key. |
| cli/azd/cmd/completion.go | Uses sentinel + suggestion for unsupported shell completion target. |
| cli/azd/cmd/auth_token.go | Uses sentinel + suggestion for invalid base64 claims. |
| cli/azd/cmd/auth_login.go | Uses sentinel + suggestion for delegated auth-mode login disabled scenario. |
dd89895 to 9a610cb
Looked at error telemetry for azd 1.23.7 and 1.23.8. About 15k were just 'internal.errors_errorString' and ~3.6k were 'error.suggestion', which is not useful for debugging.

Why this was happening:
- MapError was matching the ErrorWithSuggestion wrapper before looking at the real error inside, so things like auth and ARM failures got misclassified.
- Most Run() methods used bare errors.New or fmt.Errorf with no typed error, so MapError had nothing to match on and dumped them in the catch-all.

What this does:
- Unwrap ErrorWithSuggestion first so the real error gets classified
- Add 28 typed errors covering all commands
- Fix all 54 bare error sites to wrap typed errors with %w
- Add ErrorWithSuggestion at each site for better agent/CLI experience
- Add an AST test that blocks new bare errors from being added

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Return base errors instead of wrapping with unhelpful 'This is an internal error...' suggestion text. Addresses JeffreyCA review feedback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore original error.suggestion + error.type classification. Wei confirmed the empty error.type was a backend GDPR classification issue, now fixed. The original design is correct: ErrorWithSuggestion produces error.suggestion as ResultCode with the inner error type in the error.type attribute.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a02e377 to 3c246db
Problem
Telemetry for azd 1.23.7/1.23.8 showed ~15k errors as internal.errors_errorString, the catch-all bucket when MapError can't classify an error. Most Run() methods used bare errors.New or fmt.Errorf with no typed error, so MapError had nothing to match on.
Additionally, errors wrapped in ErrorWithSuggestion only showed error.suggestion as the ResultCode, with error.type empty due to a backend GDPR classification gap (now fixed by @weikanglim).
What this does
- Typed sentinel errors in internal/errors.go covering all command domains (env, auth, config, extensions, deploy, etc.)
- Sentinels wrapped in ErrorWithSuggestion for both telemetry classification and user-facing suggestions
- classifySentinel() helper in MapError: when an error hits the error.suggestion branch, this populates error.type with a descriptive code (e.g. internal.key_not_found) instead of the raw Go type
- Test_RunMethodsNoBareErrors: scans all action Run() methods, empty allowlist
- Test_PackageLevelErrorsMapped: ensures every sentinel in internal/errors.go appears in classifySentinel()
Telemetry result
Errors now show up as:
- ResultCode: error.suggestion with error.type: internal.key_not_found (for ErrorWithSuggestion-wrapped sentinels)
- ResultCode: internal.extension_not_found (for naked sentinel returns)
Instead of the opaque internal.errors_errorString.
Testing
- Test_MapError: covers all sentinel paths, ErrorWithSuggestion wrapping, and catch-all guardrails
- TestMapError_ErrorWithSuggestionSetsErrorType: verifies classifySentinel() populates error.type
- Test_PackageLevelErrorsMapped and Test_RunMethodsNoBareErrors: AST enforcement
- env_test.go)
- --trace-log-file