Skip to content

fix(server): k8s patch_sandbox_metadata correctly deletes keys and returns post-patch state#899

Merged
hittyt merged 2 commits into
alibaba:mainfrom
Pangjiping:fix/server/k8s-patch-metadata-merge-and-stale-read
May 17, 2026
Merged

fix(server): k8s patch_sandbox_metadata correctly deletes keys and returns post-patch state#899
hittyt merged 2 commits into
alibaba:mainfrom
Pangjiping:fix/server/k8s-patch-metadata-merge-and-stale-read

Conversation

@Pangjiping
Copy link
Copy Markdown
Collaborator

Summary

Two bugs in KubernetesSandboxService.patch_sandbox_metadata caused the nightly k8s mini E2E test_02_metadata_filter_and_logic to fail:

  1. JSON merge patch (RFC 7396) on metadata.labels merges keys recursively — keys absent from the patch body are kept. The previous code computed the desired final labels dict (with deleted keys removed) and sent it, so deleted keys were never actually removed on the API server.

  2. After the PATCH, the code re-fetched the workload via _get_workload_or_404, which goes through K8sClient.get_custom_object that prefers the informer cache. The informer is eventually consistent, so the read could land before the watch event arrived and return the pre-patch labels.

Fix both by:

  • Building the merge-patch body with explicit None for deleted keys.
  • Using the API server's PATCH response (returned from patch_labels) directly, instead of re-reading via the cache.

WorkloadProvider.patch_labels now accepts Dict[str, Optional[str]] and returns the patched workload dict.

Testing

  • Not run (explain why)
  • Unit tests
  • Integration tests
  • e2e / manual verification

Breaking Changes

  • None
  • Yes (describe impact and migration path)

Checklist

  • Linked Issue or clearly described motivation
  • Added/updated docs (if needed)
  • Added/updated tests (if needed)
  • Security impact considered
  • Backward compatibility considered

…turns post-patch state

Two bugs in KubernetesSandboxService.patch_sandbox_metadata caused the
nightly k8s mini E2E test_02_metadata_filter_and_logic to fail:

1. JSON merge patch (RFC 7396) on metadata.labels merges keys recursively
   — keys absent from the patch body are kept. The previous code computed
   the desired final labels dict (with deleted keys removed) and sent it,
   so deleted keys were never actually removed on the API server.

2. After the PATCH, the code re-fetched the workload via
   _get_workload_or_404, which goes through K8sClient.get_custom_object
   that prefers the informer cache. The informer is eventually consistent,
   so the read could land before the watch event arrived and return the
   pre-patch labels.

Fix both by:
- Building the merge-patch body with explicit None for deleted keys.
- Using the API server's PATCH response (returned from patch_labels)
  directly, instead of re-reading via the cache.

WorkloadProvider.patch_labels now accepts Dict[str, Optional[str]] and
returns the patched workload dict.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Pangjiping Pangjiping requested a review from hittyt as a code owner May 17, 2026 05:43
@Pangjiping Pangjiping added the bug Something isn't working label May 17, 2026
The k8s-nightly-build workflow only runs on PRs that touch one of its
trigger paths. This PR fixes server-side k8s logic but does not modify
those paths, so add a date stamp to the existing trigger comment to
include this PR in the matrix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@hittyt hittyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hittyt hittyt merged commit 7e57438 into alibaba:main May 17, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working component/server

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants