Skip to content

azurebackup: Fix 5 telemetry-triaged bugs + telemetry tag emission + unit tests#2621

Merged
shrja-ms merged 3 commits into
microsoft:mainfrom
shrja-ms:user/azurebackup-telemetry-bugfixes-v2
May 13, 2026
Merged

azurebackup: Fix 5 telemetry-triaged bugs + telemetry tag emission + unit tests#2621
shrja-ms merged 3 commits into
microsoft:mainfrom
shrja-ms:user/azurebackup-telemetry-bugfixes-v2

Conversation

@shrja-ms
Copy link
Copy Markdown
Contributor

Summary

Fixes 5 bugs identified from Kusto telemetry analysis (May 4-11, 2026) and resolves telemetry tag emission failure.

Bug Fixes

Bug Severity Tool Fix
BUG-1 P0 backup_status Add null guard inside MapArmResourceTypeToBackupDataSourceType — accepts string? with IsNullOrEmpty check. Still hit on beta.8/beta.9 despite PR #2518 caller-side guard.
BUG-3 P1 governance_find-unprotected Relax ArmResourceTypeRegex to allow nested resource types (e.g., Microsoft.Sql/servers/databases). Changed from single-segment to multi-segment pattern.
BUG-6 P0 vault_get + 18 others Remove ValidateSubscriptionFormat and all 19 call sites. Method rejected subscription names (non-GUID) with FormatException, violating MCP convention.
NEW-2 P1 job_get Add FormatException handling in DppBackupOperations.ListJobsAsync. Azure SDK throws during job deserialization (XmlConvert.ToTimeSpan). Fixes circular fallback crash.
Telemetry P1 All Fix telemetry tags not emitting — Activity.AddTag(key, null) is a no-op in .NET. Return "auto"/"unspecified" instead of null for unset values.

New Tests

  • 17 new unit tests for AzureBackupTelemetryTags covering NormalizeVaultType, NormalizeWorkloadType, AddVaultTags, AddVaultAndWorkloadTags, and tag constants.

Validation

  • Build: 0 errors, 0 warnings
  • Unit tests: 406 passed (389 existing + 17 new)
  • Live tests (Playback): 55 passed, 2 skipped, 0 failed
  • Format check: Clean
  • Server build: Succeeded

Files Changed

File Changes
Services/AzureBackupService.cs BUG-1: null guard + BUG-3: regex fix
Services/RsvBackupOperations.cs BUG-6: remove method + 19 call sites
Services/DppBackupOperations.cs NEW-2: FormatException handling
Models/AzureBackupTelemetryTags.cs Telemetry: null to "auto"/"unspecified"
Tests/Models/AzureBackupTelemetryTagsTests.cs 17 new unit tests

Invoking Livetests

Copilot submitted PRs are not trustworthy by default. Users with write access to the repo need to validate the contents of this PR before leaving a comment with the text /azp run mcp - pullrequest - live. This will trigger the necessary livetest workflows to complete required validation.

shrja-ms added 2 commits May 11, 2026 14:04
BUG-1 (P0): Add null guard inside MapArmResourceTypeToBackupDataSourceType
  - Accept string? parameter with IsNullOrEmpty check
  - Prevents ArgumentNullException from BackupDataSourceType.op_Implicit(null)
  - Still hit on beta.8/beta.9 despite PR microsoft#2518 caller-side guard

BUG-3 (P1): Relax ArmResourceTypeRegex to allow nested resource types
  - Changed pattern from single-segment to multi-segment: (/[A-Za-z0-9]+)+
  - Allows types like Microsoft.Sql/servers/databases
  - Previously rejected valid ARM types with ArgumentException

BUG-6 (P0): Remove ValidateSubscriptionFormat and all 19 call sites
  - Method rejected subscription names (non-GUID) with FormatException
  - Violated MCP convention that --subscription accepts names and IDs
  - Subscription resolution handled by ISubscriptionService in base class

NEW-2 (P1): Add FormatException handling in DppBackupOperations.ListJobsAsync
  - Azure SDK throws FormatException during job deserialization when
    XmlConvert.ToTimeSpan encounters non-standard ISO 8601 durations
  - GetJobAsync's fallback to ListJobsAsync was circular (both crash)
  - Now catches FormatException during enumeration, returns partial results
Activity.AddTag(key, null) is a no-op in .NET - the tag is silently
dropped. Since most users don't pass --vault-type (auto-detect),
NormalizeVaultType returned null, and AddVaultTags became a no-op.

Fix: Return 'auto' instead of null for unspecified vault types, and
'unspecified' for workload types. This ensures the tag is always
emitted to the Activity and exported to Kusto/App Insights.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the Azure Backup tool’s reliability and observability by addressing telemetry-triaged failures (subscription handling, resource type parsing, ARM-type mapping nullability, DPP job listing deserialization errors) and ensuring telemetry tags are always emitted (never null).

Changes:

  • Hardened Azure Backup service logic: null-safe ARM resource type mapping and expanded ARM resource type filter validation to allow nested resource types.
  • Removed GUID-only subscription validation from RSV operations to align with MCP subscription conventions and avoid eager failures.
  • Fixed telemetry tag emission by normalizing unset values to "auto" / "unspecified", and added unit tests validating normalization + tag constants.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tools/Azure.Mcp.Tools.AzureBackup/src/Services/AzureBackupService.cs Adds null-guard for ARM resource type mapping and relaxes resource-type regex to allow nested ARM types.
tools/Azure.Mcp.Tools.AzureBackup/src/Services/RsvBackupOperations.cs Removes subscription GUID validation method and its call sites.
tools/Azure.Mcp.Tools.AzureBackup/src/Services/DppBackupOperations.cs Adds FormatException handling during DPP job enumeration to avoid hard failures.
tools/Azure.Mcp.Tools.AzureBackup/src/Models/AzureBackupTelemetryTags.cs Ensures tags emit non-null values by defaulting to "auto" / "unspecified".
tools/Azure.Mcp.Tools.AzureBackup/tests/Azure.Mcp.Tools.AzureBackup.UnitTests/Models/AzureBackupTelemetryTagsTests.cs Adds unit tests for telemetry tag normalization and emission behavior.

17 new tests covering:
- NormalizeVaultType: null/empty -> 'auto', case normalization
- NormalizeWorkloadType: null/empty -> 'unspecified', case normalization
- AddVaultTags: null activity safety, null vaultType -> 'auto' tag
- AddVaultAndWorkloadTags: both tags emitted with correct values
- Tag constants: verify azurebackup/* prefix
@shrja-ms shrja-ms force-pushed the user/azurebackup-telemetry-bugfixes-v2 branch from 1161df2 to 53af618 Compare May 11, 2026 10:24
@shrja-ms shrja-ms merged commit dde68c9 into microsoft:main May 13, 2026
15 checks passed
@github-project-automation github-project-automation Bot moved this from Untriaged to Done in Azure MCP Server May 13, 2026
srinuthati78 pushed a commit to srinuthati78/mcp that referenced this pull request May 18, 2026
…unit tests (microsoft#2621)

* azurebackup: Fix 4 telemetry-triaged bugs (BUG-1,3,6,NEW-2)

BUG-1 (P0): Add null guard inside MapArmResourceTypeToBackupDataSourceType
  - Accept string? parameter with IsNullOrEmpty check
  - Prevents ArgumentNullException from BackupDataSourceType.op_Implicit(null)
  - Still hit on beta.8/beta.9 despite PR microsoft#2518 caller-side guard

BUG-3 (P1): Relax ArmResourceTypeRegex to allow nested resource types
  - Changed pattern from single-segment to multi-segment: (/[A-Za-z0-9]+)+
  - Allows types like Microsoft.Sql/servers/databases
  - Previously rejected valid ARM types with ArgumentException

BUG-6 (P0): Remove ValidateSubscriptionFormat and all 19 call sites
  - Method rejected subscription names (non-GUID) with FormatException
  - Violated MCP convention that --subscription accepts names and IDs
  - Subscription resolution handled by ISubscriptionService in base class

NEW-2 (P1): Add FormatException handling in DppBackupOperations.ListJobsAsync
  - Azure SDK throws FormatException during job deserialization when
    XmlConvert.ToTimeSpan encounters non-standard ISO 8601 durations
  - GetJobAsync's fallback to ListJobsAsync was circular (both crash)
  - Now catches FormatException during enumeration, returns partial results

* azurebackup: Fix telemetry tags not emitting (null value = no-op)

Activity.AddTag(key, null) is a no-op in .NET - the tag is silently
dropped. Since most users don't pass --vault-type (auto-detect),
NormalizeVaultType returned null, and AddVaultTags became a no-op.

Fix: Return 'auto' instead of null for unspecified vault types, and
'unspecified' for workload types. This ensures the tag is always
emitted to the Activity and exported to Kusto/App Insights.

* azurebackup: Add unit tests for AzureBackupTelemetryTags

17 new tests covering:
- NormalizeVaultType: null/empty -> 'auto', case normalization
- NormalizeWorkloadType: null/empty -> 'unspecified', case normalization
- AddVaultTags: null activity safety, null vaultType -> 'auto' tag
- AddVaultAndWorkloadTags: both tags emitted with correct values
- Tag constants: verify azurebackup/* prefix
shrja-ms added a commit that referenced this pull request May 25, 2026
…elemetry skill (#2659)

* azurebackup: Fix KQL classification order and update bug tracker in telemetry skill

The KQL case() expression in kql-queries.md checked StatusCode 400-499
before exception type, which masked MCP tool bugs (ArgumentNullException,
ArgumentException, FormatException) as Customer errors when HandleException
mapped them to HTTP 400. This caused the weekly telemetry report to
undercount MCP bugs by ~33x (1 vs 34 in the May 11-17 report).

Changes:
- kql-queries.md: Reorder case() in queries 1, 2, and 4 to check MCP bug
  exception types (FormatException, ArgumentNullException, ArgumentException,
  InvalidOperationException) BEFORE StatusCode 400-499
- SKILL.md: Update decision tree to match the classification table (check
  exception type before status code), add explanatory note about
  HandleException mapping ArgumentNullException to HTTP 400
- SKILL.md: Update known bug table with current state:
  - BUG-1: Mark #2518 as partial fix, add #2621 as defense-in-depth
  - BUG-3: Mark #2518 as partial, add #2621 relaxed regex
  - Add NEW-2 (job_get FormatException in DPP SDK, azure-sdk-for-net#59306)
  - Update notes about incomplete #2518 fixes confirmed on beta.8/beta.9

* Address PR review: fix decision tree nesting and use 'pending' for unreleased versions

- Add missing NO branch under StatusCode check in decision tree for
  proper nesting (Copilot review comment 1)
- Replace 'beta.11 (pending)' with 'pending' in bug tracker Release
  column to avoid presuming version number (Copilot review comment 2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants