
Add TelemetryConfigRef support to VirtualMCPServer CRD#4801

Merged
ChrisJBurns merged 5 commits into main from worktree-vmcp-telemetry
Apr 14, 2026

Conversation

@ChrisJBurns
Collaborator

@ChrisJBurns ChrisJBurns commented Apr 14, 2026

Summary

VirtualMCPServer previously supported only inline telemetry configuration via spec.config.telemetry, whereas MCPServer and MCPRemoteProxy already support referencing shared MCPTelemetryConfig resources. Because of this gap, VirtualMCPServer users could not use Kubernetes-native secret references for OTLP auth headers, CA bundle ConfigMap references for internal PKI, or per-server serviceName overrides for observability isolation.

  • Add spec.telemetryConfigRef field to reference shared MCPTelemetryConfig resources
  • Add CEL validation enforcing mutual exclusivity with inline config.telemetry
  • Add TelemetryConfigHash to status for change detection and rolling updates
  • Add telemetry handler using the batched statusManager pattern
  • Add MCPTelemetryConfig watch to trigger reconciliation on config changes
  • Update converter to prefer TelemetryConfigRef over inline telemetry
  • Add CA bundle volumes and sensitive header env vars to the deployment builder
  • Regenerate CRD manifests, RBAC, deepcopy, and mocks
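The mutual-exclusivity rule from the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the struct shapes are trimmed-down stand-ins, the CEL rule text in the marker is an assumption, and validateTelemetry is a hypothetical helper that mirrors the same check in plain Go.

```go
package main

import "errors"

// TelemetryConfigRef is a hypothetical stand-in for the reference type;
// the real CRD type may carry more fields.
type TelemetryConfigRef struct {
	Name string `json:"name"`
}

// InlineTelemetry stands in for the inline spec.config.telemetry block.
type InlineTelemetry struct {
	Endpoint string `json:"endpoint,omitempty"`
}

// Config mirrors only the part of spec.config that matters here.
type Config struct {
	Telemetry *InlineTelemetry `json:"telemetry,omitempty"`
}

// VirtualMCPServerSpec is a trimmed-down illustration, not the real type.
// The marker below shows an ASSUMED CEL rule; the rule text in the PR may differ.
//
// +kubebuilder:validation:XValidation:rule="!(has(self.telemetryConfigRef) && has(self.config) && has(self.config.telemetry))",message="telemetryConfigRef and config.telemetry are mutually exclusive"
type VirtualMCPServerSpec struct {
	TelemetryConfigRef *TelemetryConfigRef `json:"telemetryConfigRef,omitempty"`
	Config             *Config             `json:"config,omitempty"`
}

var errTelemetryConflict = errors.New("telemetryConfigRef and config.telemetry are mutually exclusive")

// validateTelemetry (hypothetical) applies the same rule in plain Go:
// setting both the reference and the inline block is rejected.
func validateTelemetry(spec VirtualMCPServerSpec) error {
	inline := spec.Config != nil && spec.Config.Telemetry != nil
	if spec.TelemetryConfigRef != nil && inline {
		return errTelemetryConflict
	}
	return nil
}
```

With this shape, a spec that sets only telemetryConfigRef or only config.telemetry passes, and a spec that sets both is rejected, which is the behaviour the CEL validation is described as enforcing at admission time.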

Closes #4792

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

Changes

| File | Change |
| --- | --- |
| api/v1alpha1/virtualmcpserver_types.go | Add TelemetryConfigRef field, CEL validation, status hash, condition constants |
| pkg/virtualmcpserverstatus/types.go | Add SetTelemetryConfigHash and SetTelemetryConfigRefValidatedCondition to interface |
| pkg/virtualmcpserverstatus/collector.go | Implement new interface methods, apply hash in UpdateStatus |
| pkg/controllerutil/config.go | Add GetTelemetryConfigForVirtualMCPServer() helper |
| controllers/virtualmcpserver_telemetryconfig.go | New: handler + watch mapper for MCPTelemetryConfig |
| controllers/virtualmcpserver_controller.go | RBAC marker, handleConfigRefs extraction, MCPTelemetryConfig watch |
| controllers/virtualmcpserver_deployment.go | CA bundle volumes + sensitive header env vars |
| pkg/vmcpconfig/converter.go | Branch for TelemetryConfigRef vs inline, extracted helpers |
| controllers/virtualmcpserver_telemetryconfig_test.go | New: 6 test cases for handler + watch |
| pkg/controllerutil/config_test.go | 4 tests for GetTelemetryConfigForVirtualMCPServer |
| pkg/virtualmcpserverstatus/collector_test.go | 3 tests for telemetry hash + condition |
| pkg/vmcpconfig/converter_test.go | Test for Convert with TelemetryConfigRef |
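
For readers unfamiliar with hash-based change detection, the idea behind the TelemetryConfigHash status field can be sketched as below. The PR does not state which hash algorithm or encoding is used, so SHA-256 over a JSON encoding is an assumption here, and TelemetrySpec, hashTelemetrySpec, and needsRollout are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// TelemetrySpec is a hypothetical subset of an MCPTelemetryConfig spec.
type TelemetrySpec struct {
	Endpoint    string            `json:"endpoint"`
	ServiceName string            `json:"serviceName,omitempty"`
	Headers     map[string]string `json:"headers,omitempty"`
}

// hashTelemetrySpec returns a hex SHA-256 digest of the JSON encoding.
// encoding/json emits struct fields in declaration order and sorts map keys,
// so equal specs always produce the same digest.
func hashTelemetrySpec(spec TelemetrySpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// needsRollout compares the hash recorded in status with the hash of the
// current spec. It returns whether they differ and the current hash, which
// the caller would then store in status.
func needsRollout(statusHash string, spec TelemetrySpec) (bool, string, error) {
	current, err := hashTelemetrySpec(spec)
	if err != nil {
		return false, "", err
	}
	return current != statusHash, current, nil
}
```

Storing only the digest in status keeps the comparison cheap: a reconcile can tell that the referenced config changed without diffing the full spec.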

Does this introduce a user-facing change?

Yes. VirtualMCPServer now supports spec.telemetryConfigRef to reference shared MCPTelemetryConfig resources, consistent with MCPServer and MCPRemoteProxy. The existing inline config.telemetry field continues to work but is deprecated.

Implementation plan

Approved implementation plan

See the full plan at: #4792

Key design decisions:

  • Top-level spec.telemetryConfigRef (not nested under spec.config) for consistency with MCPServer/MCPRemoteProxy
  • Uses batched StatusManager pattern (not direct r.Status().Update()) matching VirtualMCPServer conventions
  • Converter fetches MCPTelemetryConfig after validation in handleTelemetryConfig, so it's guaranteed to exist
  • CA bundle volumes and sensitive header env vars added in deployment builder (physical deployment concerns)
  • All existing shared helpers reused: AddTelemetryCABundleVolumes, GenerateOpenTelemetryEnvVarsFromRef, NormalizeMCPTelemetryConfig
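
The batched StatusManager decision can be illustrated with a small sketch. The method names SetTelemetryConfigHash, SetTelemetryConfigRefValidatedCondition, and UpdateStatus come from this PR, but their signatures, the StatusCollector and Status types, and the queuing approach shown here are assumptions made for illustration only.

```go
package main

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// Status is a hypothetical slice of the VirtualMCPServer status.
type Status struct {
	TelemetryConfigHash string
	Conditions          []Condition
}

// StatusCollector queues status mutations so that one reconcile pass
// results in at most one status write (assumed design).
type StatusCollector struct {
	pending []func(*Status)
}

// SetTelemetryConfigHash queues a hash update; nothing is written yet.
func (c *StatusCollector) SetTelemetryConfigHash(hash string) {
	c.pending = append(c.pending, func(s *Status) { s.TelemetryConfigHash = hash })
}

// SetTelemetryConfigRefValidatedCondition queues a condition upsert.
func (c *StatusCollector) SetTelemetryConfigRefValidatedCondition(status, reason string) {
	c.pending = append(c.pending, func(s *Status) {
		upsertCondition(s, Condition{Type: "TelemetryConfigRefValidated", Status: status, Reason: reason})
	})
}

// UpdateStatus applies all queued mutations in order and reports whether
// anything was applied, so the caller can issue a single API write.
func (c *StatusCollector) UpdateStatus(s *Status) bool {
	if len(c.pending) == 0 {
		return false
	}
	for _, apply := range c.pending {
		apply(s)
	}
	c.pending = nil
	return true
}

// upsertCondition replaces an existing condition of the same type or appends it.
func upsertCondition(s *Status, cond Condition) {
	for i := range s.Conditions {
		if s.Conditions[i].Type == cond.Type {
			s.Conditions[i] = cond
			return
		}
	}
	s.Conditions = append(s.Conditions, cond)
}
```

Compared with calling r.Status().Update() from each handler, collecting changes first avoids several writes per reconcile and the resource-version conflicts those writes can cause.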

Generated with Claude Code

Large PR Justification

This is a single feature (TelemetryConfigRef for VirtualMCPServer) that requires coordinated changes across CRD types, controller handler, converter, deployment builder, MCPTelemetryConfig controller (deletion protection), status management, integration tests, documentation, and example manifests. Splitting into smaller PRs would produce incomplete/non-functional intermediate states — e.g., adding the CRD field without the controller handler leaves a field that does nothing, or adding the handler without the MCPTelemetryConfig controller update leaves a deletion protection gap.

@github-actions github-actions bot added the size/L Large PR: 600-999 lines changed label Apr 14, 2026
@ChrisJBurns ChrisJBurns requested a review from amirejaz as a code owner April 14, 2026 10:50
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 14, 2026
@codecov

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 69 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.93%. Comparing base (2dedbae) to head (aad9236).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...rator/controllers/mcptelemetryconfig_controller.go | 5.55% | 32 Missing and 2 partials ⚠️ |
| ...or/controllers/virtualmcpserver_telemetryconfig.go | 80.64% | 9 Missing and 3 partials ⚠️ |
| ...perator/controllers/virtualmcpserver_controller.go | 45.00% | 6 Missing and 5 partials ⚠️ |
| ...perator/controllers/virtualmcpserver_deployment.go | 11.11% | 6 Missing and 2 partials ⚠️ |
| cmd/thv-operator/pkg/controllerutil/config.go | 84.61% | 1 Missing and 1 partial ⚠️ |
| cmd/thv-operator/pkg/vmcpconfig/converter.go | 94.28% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4801      +/-   ##
==========================================
- Coverage   69.01%   68.93%   -0.08%     
==========================================
  Files         517      518       +1     
  Lines       54829    54980     +151     
==========================================
+ Hits        37840    37901      +61     
- Misses      14071    14154      +83     
- Partials     2918     2925       +7     

jhrozek previously approved these changes Apr 14, 2026
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 14, 2026
Contributor

@github-actions github-actions bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Collaborator Author

@ChrisJBurns ChrisJBurns left a comment

Multi-Agent Consensus Review

Agents consulted: kubernetes-expert, code-reviewer, toolhive-expert, go-security-reviewer, go-expert-developer

Consensus Summary

| # | Finding | Consensus | Severity | Action |
| --- | --- | --- | --- | --- |
| 1 | MCPTelemetryConfig controller missing VirtualMCPServer in findReferencingWorkloads | 8/10 | HIGH | Fix |
| 2 | Deployment builder silently swallows telemetry fetch errors | 8/10 | MEDIUM | Discuss |
| 3 | Redundant MCPTelemetryConfig API calls (3x per reconcile) | 9/10 | MEDIUM | Fix (or follow-up) |
| 4 | Deprecation log at wrong level + fires every reconciliation | 8/10 | MEDIUM | Fix |
| 5 | normalizeTelemetry returns nil,nil when ref set but config not found | 7/10 | LOW | Fix |
| 6 | Architecture docs not updated (Mermaid diagram + "Referenced by") | 8/10 | MEDIUM | Fix |
| 7 | Missing condition cleanup when TelemetryConfigRef is removed | 7/10 | MEDIUM | Fix |
| 8 | Double error wrapping in normalizeTelemetry | 7/10 | LOW | Fix |

Overall

The implementation is well-structured and closely follows established MCPServer/MCPRemoteProxy telemetry config ref patterns. The CRD types, CEL validation, controller handler, watch mapper, status management, and converter are all consistent with existing conventions. The test coverage (unit + integration) is solid and covers the key scenarios.

The main gap is that the MCPTelemetryConfig controller's findReferencingWorkloads was not updated to recognize VirtualMCPServer as a referencing workload. This means deletion protection via finalizer won't block deletion of an MCPTelemetryConfig that a VirtualMCPServer still references — a functional gap in the feature being added. The remaining findings are non-blocking improvements: reducing redundant API calls (3x per reconcile), fixing the deprecation log level/frequency, cleaning up stale conditions on ref removal, and updating architecture docs.
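
To make the deletion-protection gap concrete, here is a rough sketch of what a findReferencingWorkloads-style lookup has to do. Only the function name comes from the review; the Workload and WorkloadRef types, the flat input slice, the same-namespace matching, and canRemoveFinalizer are assumptions for illustration, not the controller's real code.

```go
package main

import "sort"

// Workload is a hypothetical flattened view of anything that can carry a
// telemetryConfigRef: MCPServer, MCPRemoteProxy, and now VirtualMCPServer.
type Workload struct {
	Kind               string
	Namespace          string
	Name               string
	TelemetryConfigRef string // empty when no reference is set
}

// WorkloadRef identifies one workload that still references the config.
type WorkloadRef struct {
	Kind string
	Name string
}

// findReferencingWorkloads returns every workload in the given namespace
// that references configName, sorted for stable status output.
// Assumption: references are resolved within the same namespace.
func findReferencingWorkloads(namespace, configName string, workloads []Workload) []WorkloadRef {
	var refs []WorkloadRef
	for _, w := range workloads {
		if w.Namespace == namespace && w.TelemetryConfigRef == configName {
			refs = append(refs, WorkloadRef{Kind: w.Kind, Name: w.Name})
		}
	}
	sort.Slice(refs, func(i, j int) bool {
		if refs[i].Kind != refs[j].Kind {
			return refs[i].Kind < refs[j].Kind
		}
		return refs[i].Name < refs[j].Name
	})
	return refs
}

// canRemoveFinalizer reports whether deletion may proceed: only when
// nothing references the config any more.
func canRemoveFinalizer(refs []WorkloadRef) bool {
	return len(refs) == 0
}
```

If VirtualMCPServer objects are never fed into the lookup, a config referenced only by a VirtualMCPServer yields an empty result and the finalizer is removed, which is exactly the gap described in finding 1.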

Documentation

  • docs/arch/09-operator-architecture.md — Missing VMCP -.->|telemetryConfigRef| TelCfg edge in the Mermaid diagram (line ~131). The "Referenced by" text for MCPTelemetryConfig (line ~246) must now include VirtualMCPServer.
  • docs/arch/10-virtual-mcp-architecture.md — No mention of telemetryConfigRef support. Add a brief note.

Generated with Claude Code

ChrisJBurns added a commit that referenced this pull request Apr 14, 2026
Fixes from code review on PR #4801 for issue #4792:

- Add VirtualMCPServer to MCPTelemetryConfig controller's findReferencingWorkloads,
  watch handler, and RBAC markers for deletion protection parity
- Eliminate redundant MCPTelemetryConfig API calls (3x per reconcile) by fetching
  once in handleTelemetryConfig and threading the config through to converter and
  deployment builder
- Change normalizeTelemetry to accept pre-fetched config instead of re-fetching
- Fix deprecation log level from Info to V(1) debug to avoid log flooding
- Remove stale TelemetryConfigRefValidated condition when TelemetryConfigRef is nil
- Update architecture docs with VirtualMCPServer telemetryConfigRef edges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
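
The stale-condition fix in the commit above follows a common pattern, sketched here under stated assumptions. The condition type name TelemetryConfigRefValidated is taken from the commit message; the Condition type, removeCondition, and syncTelemetryCondition are hypothetical. Real controllers typically use helpers such as meta.RemoveStatusCondition from k8s.io/apimachinery instead of hand-written slice code.

```go
package main

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
}

const conditionTelemetryConfigRefValidated = "TelemetryConfigRefValidated"

// removeCondition returns a new slice without conditions of the given type.
func removeCondition(conds []Condition, condType string) []Condition {
	kept := make([]Condition, 0, len(conds))
	for _, c := range conds {
		if c.Type != condType {
			kept = append(kept, c)
		}
	}
	return kept
}

// syncTelemetryCondition (hypothetical) keeps the condition in line with the
// spec: when refName is nil the condition is dropped, otherwise it is
// replaced with a fresh "True" entry. A real handler would set "False"
// when the referenced config cannot be found.
func syncTelemetryCondition(conds []Condition, refName *string) []Condition {
	kept := removeCondition(conds, conditionTelemetryConfigRefValidated)
	if refName == nil {
		return kept
	}
	return append(kept, Condition{Type: conditionTelemetryConfigRefValidated, Status: "True"})
}
```

Without the nil branch, a VirtualMCPServer that once had a telemetryConfigRef would keep reporting a validation condition for a reference that no longer exists.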
@ChrisJBurns ChrisJBurns requested a review from rdimitrov as a code owner April 14, 2026 12:14
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 14, 2026
@github-actions github-actions bot dismissed their stale review April 14, 2026 12:14

Large PR justification has been provided. Thank you!

@github-actions
Contributor

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 14, 2026
ChrisJBurns and others added 5 commits April 14, 2026 13:43
VirtualMCPServer can now reference shared MCPTelemetryConfig resources
via spec.telemetryConfigRef, following the same pattern used by
MCPServer and MCPRemoteProxy. This enables shared telemetry
configuration, Kubernetes-native secret references for OTLP auth
headers, CA bundle ConfigMap references, and per-server serviceName
overrides.

Implements changes for issue #4792:
- Add TelemetryConfigRef field and CEL mutual exclusivity validation
- Add TelemetryConfigHash to status for change detection
- Add telemetry handler with batched statusManager pattern
- Add MCPTelemetryConfig watch in SetupWithManager
- Update converter to prefer TelemetryConfigRef over inline telemetry
- Add CA bundle volumes and sensitive header env vars to deployment
- Regenerate CRD manifests, RBAC, deepcopy, and mocks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add deprecation comments to config.Config.Telemetry field (for operator
context) and emit a warning log when inline telemetry is used during
conversion, guiding operators to migrate to spec.telemetryConfigRef.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add envtest-based integration tests for VirtualMCPServer
TelemetryConfigRef: hash tracking, config change detection, missing
ref condition, and CEL mutual exclusivity validation.

Update observability docs to show the preferred telemetryConfigRef
pattern for VirtualMCPServer and mark inline config.telemetry as
deprecated. Add example manifest demonstrating shared
MCPTelemetryConfig with VirtualMCPServer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes from code review on PR #4801 for issue #4792:

- Add VirtualMCPServer to MCPTelemetryConfig controller's findReferencingWorkloads,
  watch handler, and RBAC markers for deletion protection parity
- Eliminate redundant MCPTelemetryConfig API calls (3x per reconcile) by fetching
  once in handleTelemetryConfig and threading the config through to converter and
  deployment builder
- Change normalizeTelemetry to accept pre-fetched config instead of re-fetching
- Fix deprecation log level from Info to V(1) debug to avoid log flooding
- Remove stale TelemetryConfigRefValidated condition when TelemetryConfigRef is nil
- Update architecture docs with VirtualMCPServer telemetryConfigRef edges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ChrisJBurns ChrisJBurns force-pushed the worktree-vmcp-telemetry branch from a3bf1f3 to aad9236 Compare April 14, 2026 12:49
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 14, 2026
@ChrisJBurns ChrisJBurns merged commit fa955cd into main Apr 14, 2026
41 checks passed
@ChrisJBurns ChrisJBurns deleted the worktree-vmcp-telemetry branch April 14, 2026 13:10
Labels

size/XL Extra large PR: 1000+ lines changed

Development

Successfully merging this pull request may close these issues.

VirtualMCPServer to use MCPTelemetryConfig

2 participants