Wire MCPServer to MCPTelemetryConfig with inline deprecation#4482
ChrisJBurns merged 6 commits into main
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report
❌ Patch coverage is [...]
Additional details and impacted files

| Metric | main | #4482 | +/- |
| --- | --- | --- | --- |
| Coverage | 69.43% | 69.31% | -0.12% |
| Files | 499 | 501 | +2 |
| Lines | 51136 | 51339 | +203 |
| Hits | 35505 | 35586 | +81 |
| Misses | 12896 | 13009 | +113 |
| Partials | 2735 | 2744 | +9 |
…etry

Add TelemetryConfigRef field to MCPServerSpec that references a shared MCPTelemetryConfig resource, deprecating the inline Telemetry field. CEL validation rejects specs that set both fields simultaneously.

The MCPServer controller resolves the reference during reconciliation, tracks config hash changes for drift detection, sets status conditions, generates OTEL environment variables (including sensitive header secret injection), and watches MCPTelemetryConfig for cascade reconciliation.

The MCPTelemetryConfig controller now tracks referencing servers in status.referencingServers, blocks deletion while workloads reference the config, and watches MCPServer changes to keep the reference list current.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
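The drift detection mentioned in this commit can be pictured as comparing a hash stored on the MCPServer status with a hash of the current config. The following is only a minimal sketch under assumptions: `telemetrySpec`, `hashSpec`, and `hasDrifted` are hypothetical names, and hashing the JSON encoding with SHA-256 is an illustrative choice, not necessarily what the operator does.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// telemetrySpec is a hypothetical stand-in for the MCPTelemetryConfig spec.
type telemetrySpec struct {
	Endpoint     string  `json:"endpoint"`
	Insecure     bool    `json:"insecure"`
	SamplingRate float64 `json:"samplingRate"`
}

// hashSpec returns the hex SHA-256 digest of the JSON-encoded spec.
// Assumption: the real hashing scheme may differ.
func hashSpec(s telemetrySpec) (string, error) {
	raw, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// hasDrifted reports whether the hash recorded on the server status
// no longer matches the current shared config.
func hasDrifted(recordedHash string, current telemetrySpec) (bool, error) {
	h, err := hashSpec(current)
	if err != nil {
		return false, err
	}
	return h != recordedHash, nil
}

func main() {
	spec := telemetrySpec{Endpoint: "otel-collector:4318", SamplingRate: 0.1}
	recorded, _ := hashSpec(spec)

	spec.SamplingRate = 0.5 // the shared config is edited after the hash was recorded
	drifted, _ := hasDrifted(recorded, spec)
	fmt.Println("drifted:", drifted)
}
```

When the hashes differ, a controller would typically update the workload and record the new hash in status.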
a13b896 to 1896ee0
Large PR justification has been provided. Thank you!
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.
Run `task operator-generate` and `task operator-manifests` to update the Helm-wrapped CRD template with the telemetryConfigRef field, CEL validation rule, and telemetryConfigHash status field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude review this PR focusing on critical/high priority issues and non-controversial correctness issues only. Specifically look for:
Ignore: nitpicks, naming/style preferences, import ordering, comment wording, controversial design choices, test coverage suggestions. Conclude with a clear APPROVE or REQUEST CHANGES verdict.
Claude finished @ChrisJBurns's task in 4m 46s
PR Review: Wire MCPServer to MCPTelemetryConfig
Findings
1. TOCTOU: Deployment Built Without Telemetry Env Vars on Second Fetch Failure [...]
- getTelemetryConfigForMCPServer now returns (nil, nil) for NotFound so callers can distinguish "config missing" from "API unavailable"
- handleTelemetryConfig uses ConditionReasonTelemetryConfigRefError for transient API errors, reserving NotFound for genuinely missing resources
- Add ConditionReasonTelemetryConfigRefError constant
- deploymentForMCPServer logs at visible level when transient fetch fails after prior validation, with clear message about degradation
- Update tests for new (nil, nil) NotFound semantics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude review the latest push (commit ca3f9e0) which addresses your two findings:
Focus on critical/high issues and non-controversial correctness issues only. Ignore nitpicks, style, and controversial design choices. Conclude with APPROVE or REQUEST CHANGES.
Claude finished @ChrisJBurns's task in 1m 48s
PR Review: commit ca3f9e0 (follow-up fixes)
Verification of the Two Fixes
Fix 1: TOCTOU / silent missing env vars
Assessment: The critical silent-failure path is fixed. The residual case (transient error hitting the narrow window after [...]
Fix 2: Wrong condition reason + dead nil check
Operators will now correctly distinguish "config doesn't exist" ( [...]
No new issues found
The changes are minimal and targeted. No new correctness, security, or data-loss issues were introduced.
Verdict
APPROVE
Both findings are addressed correctly. The fixes are well-scoped and don't introduce new problems.
Add envtest integration tests for the new telemetry config reference behavior:

MCPTelemetryConfig controller (mcp-telemetry-config suite):
- Reference tracking: ReferencingServers status updated when MCPServer references a config
- Deletion protection: finalizer blocks deletion while referenced, allows deletion after reference is removed

MCPServer CEL validation (mcp-server suite):
- Mutual exclusion: reject when both telemetry and telemetryConfigRef are set
- Accept telemetryConfigRef alone
- Accept inline telemetry alone

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
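The three CEL validation cases listed above amount to a mutual-exclusion check. As a rough Go rendering of that logic (the real rule is a CEL expression on the CRD whose exact text is not shown on this page; `serverSpec`, `validateTelemetry`, and the error message are hypothetical):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, trimmed-down stand-ins for the MCPServer spec types.
type inlineTelemetry struct{ Endpoint string }
type telemetryConfigRef struct{ Name string }

type serverSpec struct {
	Telemetry          *inlineTelemetry    // deprecated inline field
	TelemetryConfigRef *telemetryConfigRef // reference to a shared config
}

var errBothSet = errors.New("telemetry and telemetryConfigRef are mutually exclusive")

// validateTelemetry rejects specs that set both fields and accepts
// either field alone (or neither).
func validateTelemetry(s serverSpec) error {
	if s.Telemetry != nil && s.TelemetryConfigRef != nil {
		return errBothSet
	}
	return nil
}

func main() {
	both := serverSpec{
		Telemetry:          &inlineTelemetry{Endpoint: "otel:4318"},
		TelemetryConfigRef: &telemetryConfigRef{Name: "shared-otel"},
	}
	refOnly := serverSpec{TelemetryConfigRef: &telemetryConfigRef{Name: "shared-otel"}}
	inlineOnly := serverSpec{Telemetry: &inlineTelemetry{Endpoint: "otel:4318"}}

	fmt.Println(validateTelemetry(both))       // rejected
	fmt.Println(validateTelemetry(refOnly))    // accepted
	fmt.Println(validateTelemetry(inlineOnly)) // accepted
}
```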
Verify that MCPServer with telemetryConfigRef produces the same RunConfig telemetry fields as the existing inline telemetry path:
- telemetryConfigRef with ServiceName override: endpoint, insecure, tracing, metrics, sampling rate, prometheus all match inline test
- telemetryConfigRef without ServiceName: defaults to MCPServer name

Also registers MCPTelemetryConfigReconciler in the mcp-server test suite so the referenced config gets reconciled (hash computed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
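The ServiceName behavior these tests cover (an explicit override wins, otherwise the MCPServer name is used) can be sketched in a few lines. The type and function names below are hypothetical; only the defaulting rule itself comes from the commit message.

```go
package main

import "fmt"

// telemetryConfigReference is a hypothetical, trimmed-down version of the
// reference type: a config name plus an optional per-server override.
type telemetryConfigReference struct {
	Name        string
	ServiceName string // optional; empty means "use the MCPServer name"
}

// effectiveServiceName returns the override when set and falls back to
// the MCPServer name otherwise.
func effectiveServiceName(ref telemetryConfigReference, serverName string) string {
	if ref.ServiceName != "" {
		return ref.ServiceName
	}
	return serverName
}

func main() {
	shared := telemetryConfigReference{Name: "shared-otel"}
	custom := telemetryConfigReference{Name: "shared-otel", ServiceName: "billing-mcp"}

	fmt.Println(effectiveServiceName(shared, "github-server"))
	fmt.Println(effectiveServiceName(custom, "github-server"))
}
```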
Run `task crdref-gen` to update docs/operator/crd-api.md with the new telemetryConfigRef field, MCPTelemetryConfigReference type, telemetryConfigHash status field, and deprecation notice on the inline telemetry field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Add `telemetryConfigRef` field to MCPServerSpec that references an MCPTelemetryConfig in the same namespace
- Deprecate the inline `telemetry` field, with CEL validation preventing both from being set
- MCPTelemetryConfig controller tracks referencing servers in `status.referencingServers` and blocks deletion while workloads reference the config

Ref #4253
Type of change
Test plan
- `task test`
- `task lint-fix`
Changes

| File | Change |
| --- | --- |
| `api/v1alpha1/mcpserver_types.go` | `TelemetryConfigRef`, `TelemetryConfigHash`, condition constants, CEL mutual exclusion, deprecation comment |
| `controllers/mcpserver_telemetryconfig.go` | `handleTelemetryConfig`, `getTelemetryConfigForMCPServer` |
| `controllers/mcpserver_controller.go` | `handleTelemetryConfig` in Reconcile, RBAC marker, env var generation for ref path, watch MCPTelemetryConfig |
| `controllers/mcpserver_runconfig.go` | |
| `controllers/mcptelemetryconfig_controller.go` | |
| `pkg/controllerutil/telemetry.go` | `GenerateOpenTelemetryEnvVarsFromRef` with sensitive header secret injection |
| `pkg/spectoconfig/telemetry.go` | `NormalizeMCPTelemetryConfig` for ref-based conversion |
| `pkg/runconfig/telemetry.go` | `AddMCPTelemetryConfigRefOptions` for ref-based runner options |

Does this introduce a user-facing change?
Yes. MCPServer now supports `spec.telemetryConfigRef` to reference a shared MCPTelemetryConfig resource. The inline `spec.telemetry` field is deprecated. Example: [...]
Large PR Justification
Special notes for reviewers
- `MCPTelemetryConfigReference.ServiceName` override lets each MCPServer sharing a config still have distinct telemetry identity
- Sensitive headers are injected as `TOOLHIVE_OTEL_HEADER_*` env vars on the proxy container; the proxy runner will need a follow-up to merge these into the OTLP exporter at startup

Generated with Claude Code
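For the follow-up mentioned in the reviewer notes, one possible shape for collecting the `TOOLHIVE_OTEL_HEADER_*` variables is sketched below. This is speculative: the PR does not say how the suffix maps to a header name, so using the suffix verbatim as the key is an assumption, and `collectOTLPHeaders` is a hypothetical helper rather than existing proxy runner code.

```go
package main

import (
	"fmt"
	"strings"
)

// headerEnvPrefix is the env var prefix named in the PR description.
const headerEnvPrefix = "TOOLHIVE_OTEL_HEADER_"

// collectOTLPHeaders scans "NAME=value" entries (the format returned by
// os.Environ) and gathers those carrying the header prefix.
// Assumption: the suffix after the prefix is used verbatim as the header key.
func collectOTLPHeaders(environ []string) map[string]string {
	headers := make(map[string]string)
	for _, kv := range environ {
		name, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(name, headerEnvPrefix) {
			continue
		}
		key := strings.TrimPrefix(name, headerEnvPrefix)
		if key == "" {
			continue // a bare prefix carries no header name
		}
		headers[key] = value
	}
	return headers
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"TOOLHIVE_OTEL_HEADER_AUTHORIZATION=Bearer abc=",
	}
	fmt.Println(collectOTLPHeaders(env))
}
```

The resulting map could then be handed to whatever option the exporter setup uses for extra headers; that wiring is left out because it depends on code not shown here.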