azurebackup: Fix 5 telemetry-triaged bugs + telemetry tag emission + unit tests#2621
Merged
shrja-ms merged 3 commits intoMay 13, 2026
Merged
Conversation
BUG-1 (P0): Add null guard inside MapArmResourceTypeToBackupDataSourceType - Accept string? parameter with IsNullOrEmpty check - Prevents ArgumentNullException from BackupDataSourceType.op_Implicit(null) - Still hit on beta.8/beta.9 despite PR microsoft#2518 caller-side guard BUG-3 (P1): Relax ArmResourceTypeRegex to allow nested resource types - Changed pattern from single-segment to multi-segment: (/[A-Za-z0-9]+)+ - Allows types like Microsoft.Sql/servers/databases - Previously rejected valid ARM types with ArgumentException BUG-6 (P0): Remove ValidateSubscriptionFormat and all 19 call sites - Method rejected subscription names (non-GUID) with FormatException - Violated MCP convention that --subscription accepts names and IDs - Subscription resolution handled by ISubscriptionService in base class NEW-2 (P1): Add FormatException handling in DppBackupOperations.ListJobsAsync - Azure SDK throws FormatException during job deserialization when XmlConvert.ToTimeSpan encounters non-standard ISO 8601 durations - GetJobAsync's fallback to ListJobsAsync was circular (both crash) - Now catches FormatException during enumeration, returns partial results
Activity.AddTag(key, null) is a no-op in .NET - the tag is silently dropped. Since most users don't pass --vault-type (auto-detect), NormalizeVaultType returned null, and AddVaultTags became a no-op. Fix: Return 'auto' instead of null for unspecified vault types, and 'unspecified' for workload types. This ensures the tag is always emitted to the Activity and exported to Kusto/App Insights.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves the Azure Backup tool’s reliability and observability by addressing telemetry-triaged failures (subscription handling, resource type parsing, ARM-type mapping nullability, DPP job listing deserialization errors) and ensuring telemetry tags are always emitted (never null).
Changes:
- Hardened Azure Backup service logic: null-safe ARM resource type mapping and expanded ARM resource type filter validation to allow nested resource types.
- Removed GUID-only subscription validation from RSV operations to align with MCP subscription conventions and avoid eager failures.
- Fixed telemetry tag emission by normalizing unset values to
"auto"/"unspecified", and added unit tests validating normalization + tag constants.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tools/Azure.Mcp.Tools.AzureBackup/src/Services/AzureBackupService.cs | Adds null-guard for ARM resource type mapping and relaxes resource-type regex to allow nested ARM types. |
| tools/Azure.Mcp.Tools.AzureBackup/src/Services/RsvBackupOperations.cs | Removes subscription GUID validation method and its call sites. |
| tools/Azure.Mcp.Tools.AzureBackup/src/Services/DppBackupOperations.cs | Adds FormatException handling during DPP job enumeration to avoid hard failures. |
| tools/Azure.Mcp.Tools.AzureBackup/src/Models/AzureBackupTelemetryTags.cs | Ensures tags emit non-null values by defaulting to "auto" / "unspecified". |
| tools/Azure.Mcp.Tools.AzureBackup/tests/Azure.Mcp.Tools.AzureBackup.UnitTests/Models/AzureBackupTelemetryTagsTests.cs | Adds unit tests for telemetry tag normalization and emission behavior. |
17 new tests covering: - NormalizeVaultType: null/empty -> 'auto', case normalization - NormalizeWorkloadType: null/empty -> 'unspecified', case normalization - AddVaultTags: null activity safety, null vaultType -> 'auto' tag - AddVaultAndWorkloadTags: both tags emitted with correct values - Tag constants: verify azurebackup/* prefix
1161df2 to
53af618
Compare
alzimmermsft
approved these changes
May 12, 2026
srinuthati78
pushed a commit
to srinuthati78/mcp
that referenced
this pull request
May 18, 2026
…unit tests (microsoft#2621) * azurebackup: Fix 4 telemetry-triaged bugs (BUG-1,3,6,NEW-2) BUG-1 (P0): Add null guard inside MapArmResourceTypeToBackupDataSourceType - Accept string? parameter with IsNullOrEmpty check - Prevents ArgumentNullException from BackupDataSourceType.op_Implicit(null) - Still hit on beta.8/beta.9 despite PR microsoft#2518 caller-side guard BUG-3 (P1): Relax ArmResourceTypeRegex to allow nested resource types - Changed pattern from single-segment to multi-segment: (/[A-Za-z0-9]+)+ - Allows types like Microsoft.Sql/servers/databases - Previously rejected valid ARM types with ArgumentException BUG-6 (P0): Remove ValidateSubscriptionFormat and all 19 call sites - Method rejected subscription names (non-GUID) with FormatException - Violated MCP convention that --subscription accepts names and IDs - Subscription resolution handled by ISubscriptionService in base class NEW-2 (P1): Add FormatException handling in DppBackupOperations.ListJobsAsync - Azure SDK throws FormatException during job deserialization when XmlConvert.ToTimeSpan encounters non-standard ISO 8601 durations - GetJobAsync's fallback to ListJobsAsync was circular (both crash) - Now catches FormatException during enumeration, returns partial results * azurebackup: Fix telemetry tags not emitting (null value = no-op) Activity.AddTag(key, null) is a no-op in .NET - the tag is silently dropped. Since most users don't pass --vault-type (auto-detect), NormalizeVaultType returned null, and AddVaultTags became a no-op. Fix: Return 'auto' instead of null for unspecified vault types, and 'unspecified' for workload types. This ensures the tag is always emitted to the Activity and exported to Kusto/App Insights. * azurebackup: Add unit tests for AzureBackupTelemetryTags 17 new tests covering: - NormalizeVaultType: null/empty -> 'auto', case normalization - NormalizeWorkloadType: null/empty -> 'unspecified', case normalization - AddVaultTags: null activity safety, null vaultType -> 'auto' tag - AddVaultAndWorkloadTags: both tags emitted with correct values - Tag constants: verify azurebackup/* prefix
shrja-ms
added a commit
that referenced
this pull request
May 25, 2026
…elemetry skill (#2659) * azurebackup: Fix KQL classification order and update bug tracker in telemetry skill The KQL case() expression in kql-queries.md checked StatusCode 400-499 before exception type, which masked MCP tool bugs (ArgumentNullException, ArgumentException, FormatException) as Customer errors when HandleException mapped them to HTTP 400. This caused the weekly telemetry report to undercount MCP bugs by ~33x (1 vs 34 in the May 11-17 report). Changes: - kql-queries.md: Reorder case() in queries 1, 2, and 4 to check MCP bug exception types (FormatException, ArgumentNullException, ArgumentException, InvalidOperationException) BEFORE StatusCode 400-499 - SKILL.md: Update decision tree to match the classification table (check exception type before status code), add explanatory note about HandleException mapping ArgumentNullException to HTTP 400 - SKILL.md: Update known bug table with current state: - BUG-1: Mark #2518 as partial fix, add #2621 as defense-in-depth - BUG-3: Mark #2518 as partial, add #2621 relaxed regex - Add NEW-2 (job_get FormatException in DPP SDK, azure-sdk-for-net#59306) - Update notes about incomplete #2518 fixes confirmed on beta.8/beta.9 * Address PR review: fix decision tree nesting and use 'pending' for unreleased versions - Add missing NO branch under StatusCode check in decision tree for proper nesting (Copilot review comment 1) - Replace 'beta.11 (pending)' with 'pending' in bug tracker Release column to avoid presuming version number (Copilot review comment 2)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes 5 bugs identified from Kusto telemetry analysis (May 4-11, 2026) and resolves telemetry tag emission failure.
Bug Fixes
backup_statusMapArmResourceTypeToBackupDataSourceType— acceptsstring?withIsNullOrEmptycheck. Still hit on beta.8/beta.9 despite PR #2518 caller-side guard.governance_find-unprotectedArmResourceTypeRegexto allow nested resource types (e.g.,Microsoft.Sql/servers/databases). Changed from single-segment to multi-segment pattern.vault_get+ 18 othersValidateSubscriptionFormatand all 19 call sites. Method rejected subscription names (non-GUID) with FormatException, violating MCP convention.job_getDppBackupOperations.ListJobsAsync. Azure SDK throws during job deserialization (XmlConvert.ToTimeSpan). Fixes circular fallback crash.Activity.AddTag(key, null)is a no-op in .NET. Return"auto"/"unspecified"instead of null for unset values.New Tests
AzureBackupTelemetryTagscoveringNormalizeVaultType,NormalizeWorkloadType,AddVaultTags,AddVaultAndWorkloadTags, and tag constants.Validation
Files Changed
Services/AzureBackupService.csServices/RsvBackupOperations.csServices/DppBackupOperations.csModels/AzureBackupTelemetryTags.csTests/Models/AzureBackupTelemetryTagsTests.csInvoking Livetests
Copilot submitted PRs are not trustworthy by default. Users with
writeaccess to the repo need to validate the contents of this PR before leaving a comment with the text/azp run mcp - pullrequest - live. This will trigger the necessary livetest workflows to complete required validation.