vsphere: Cache REST API sessions to prevent excessive vCenter logouts #1432
base: main
This commit fixes a critical issue where the machine-api-operator was creating and destroying vCenter REST API sessions on every machine reconciliation, causing excessive login/logout cycles that pollute vCenter audit logs and create unnecessary session churn.

Root Cause:
The WithRestClient() and WithCachingTagsManager() wrapper functions were creating new REST sessions, performing operations, and immediately logging out on every invocation. With hundreds of machines reconciling periodically, this created a constant stream of login/logout events.

Solution (inspired by cluster-api-provider-vsphere):
- Add TagManager field to Session struct to cache REST client
- Initialize and cache REST client during session creation (GetOrCreate)
- Validate both SOAP and REST session health before reusing cached sessions
- Add GetCachingTagsManager() helper for direct access to cached tag manager
- Update reconcileRegionAndZoneLabels() to use cached tag manager
- Update reconcileTags() to use cached tag manager
- Deprecate WithRestClient() and WithCachingTagsManager() for backward compatibility

Key Changes:
1. pkg/controller/vsphere/session/session.go:
   - Added TagManager *tags.Manager field to Session struct
   - Modified GetOrCreate() to create and cache REST client once
   - Added dual session validation (SOAP + REST) before reusing sessions
   - Added GetCachingTagsManager() method for direct access
   - Deprecated old wrapper functions with migration guidance
2. pkg/controller/vsphere/reconciler.go:
   - Updated reconcileRegionAndZoneLabels() to use GetCachingTagsManager()
   - Updated reconcileTags() to use GetCachingTagsManager()
   - Eliminated callback pattern in favor of direct access

Impact:
- Eliminates excessive vCenter login/logout cycles
- Reduces vCenter session churn from O(reconciliations) to O(1) per MAPI instance
- Improves performance by removing authentication overhead on every tag operation
- REST session now lives as long as SOAP session (until invalidation)

Reference Implementation:
https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/pkg/session/session.go

Backward Compatibility:
The deprecated wrapper functions are maintained with warning logs to support existing test code. All production code paths now use the new pattern.

Fixes: Excessive vCenter logout events reported by customers

Signed-off-by: Claude Code Assistant <noreply@anthropic.com>
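For reviewers who want the caching pattern in isolation, here is a minimal sketch of the idea described above: log in to the REST endpoint once, keep the client on the cached session, and reuse it only while both the SOAP and REST sides report healthy. It is not the code in this PR and does not use govmomi. `restClient`, `fakeREST`, `counters`, `Cache`, and the `SOAPActive` field are hypothetical stand-ins invented for illustration, and the real `GetOrCreate()` takes different parameters.

```go
package main

import (
	"fmt"
	"sync"
)

// restClient is a hypothetical stand-in for a vCenter REST client.
type restClient interface {
	Login() error
	Logout() error
	Active() bool
}

// counters records how many login/logout calls reach the (fake) vCenter.
type counters struct{ logins, logouts int }

// fakeREST is an in-memory client used only to make the sketch runnable.
type fakeREST struct {
	c      *counters
	active bool
}

func (f *fakeREST) Login() error  { f.c.logins++; f.active = true; return nil }
func (f *fakeREST) Logout() error { f.c.logouts++; f.active = false; return nil }
func (f *fakeREST) Active() bool  { return f.active }

// Session caches the REST client next to the (simulated) SOAP session state.
type Session struct {
	SOAPActive bool // assumption: stands in for a real SOAP session health check
	REST       restClient
}

// Cache keeps one Session per key (for example, server plus username).
type Cache struct {
	mu       sync.Mutex
	sessions map[string]*Session
	newREST  func() restClient
}

func NewCache(newREST func() restClient) *Cache {
	return &Cache{sessions: map[string]*Session{}, newREST: newREST}
}

// GetOrCreate reuses a cached session only if both SOAP and REST are healthy;
// otherwise it discards the stale entry and logs in exactly once.
func (c *Cache) GetOrCreate(key string) (*Session, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if s, ok := c.sessions[key]; ok {
		if s.SOAPActive && s.REST.Active() {
			return s, nil // cache hit: no login, no logout
		}
		_ = s.REST.Logout() // best-effort cleanup of the stale session
		delete(c.sessions, key)
	}

	rc := c.newREST()
	if err := rc.Login(); err != nil {
		return nil, fmt.Errorf("rest login: %w", err)
	}
	s := &Session{SOAPActive: true, REST: rc}
	c.sessions[key] = s
	return s, nil
}

func main() {
	c := &counters{}
	cache := NewCache(func() restClient { return &fakeREST{c: c} })

	// Simulate many reconciliations sharing one cached session.
	for i := 0; i < 100; i++ {
		if _, err := cache.GetOrCreate("vcenter.example.com/user"); err != nil {
			panic(err)
		}
	}
	fmt.Printf("logins=%d logouts=%d\n", c.logins, c.logouts)
}
```

With this shape, the number of logins tracks the number of session invalidations rather than the number of reconciliations, which is the O(reconciliations) to O(1) change the description claims.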
/test e2e-vsphere-ovn-serial |
/test e2e-vsphere-ovn |
@jcpowermac: The following tests failed, say