fix(networking): drop metrics port from external Service by bdchatham · Pull Request #120 · sei-protocol/sei-k8s-controller

bdchatham · 2026-04-22T19:38:47Z

Summary

Drops metrics:26660 from the external (public-facing) Service's port list, fixing a double-scrape of every SeiNode pod caused by the ServiceMonitor selector matching both the per-pod and external Services.

Context

Recommended by a K8s review of the duplicate-scrape observation:

Per-pod headless Service (<sndname>-<ordinal>) carries labels {sei.io/nodedeployment, sei.io/node, sei.io/revision} and exposes all ports (including metrics)
Aggregate external Service (<sndname>-external) carries only {sei.io/nodedeployment} but was generated with the same port list — including metrics
ServiceMonitor's selector.matchLabels: {sei.io/nodedeployment: ...} matches both → prometheus-operator targets both → Prometheus scrapes each pod twice

Observed live on a gprusak soak test: 3 RPC pods rendered as 6 series on tendermint_consensus_latest_block_height count panels.

The internal ClusterIP aggregate (<sndname>-internal) correctly omits the metrics port already, so this isn't a cross-cutting issue — only the external path was wrong.

Approach

Added externalPortsForMode(mode) that wraps portsForMode(mode) and filters out metrics. Tactical — a better home for this rule is sei-protocol/sei-config next to NodePortsForMode. Followed up with sei-config#7 to add a proper ExternalServicePorts helper; this PR's filter will call through to it once that lands (TODO comment on the helper).

Scope

internal/controller/nodedeployment/networking.go — new externalPortsForMode helper; generateExternalService uses it.
internal/controller/nodedeployment/networking_test.go — updated two existing expectations (AllPortsForFullMode now has 6 ports not 7; ValidatorModePorts is p2p only).
No CRD changes. No change to per-pod Service. No change to internal aggregate Service.

Test plan

go test ./internal/controller/nodedeployment/ passes.
After merge + image bump + Flux reconcile: kubectl -n gprusak describe svc v6-5-run-3-rpc-0-external shows no metrics port; count by(pod)(tendermint_consensus_latest_block_height{chain_id="v6-5-run-3"}) returns 1 per pod instead of 2.

Refs: sei-protocol/sei-config#7 (upstream helper), #116 / #118 (prior component / chain_id label work).

🤖 Generated with Claude Code

The external (public-facing) Service was carrying the same port list as the per-pod headless Service, including `metrics:26660`. ServiceMonitor selectors then matched both Services, so every pod was scraped twice — visible as 2x the expected series count on dashboards (e.g. 3 RPC pods rendering as 6 on count-based panels). Adds a tactical filter in the external-service generator that drops `metrics` from the port list. Per-pod Services remain scrape targets unchanged. Longer-term, the "which ports belong on an external Service" rule should live in sei-config next to the port definitions themselves; see sei-protocol/sei-config#7 for the upstream helper proposal. This PR's externalPortsForMode will call through to that helper once it lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bdchatham merged commit ab7faf0 into main Apr 22, 2026
2 checks passed

bdchatham mentioned this pull request Apr 22, 2026

chore: bump controller image to ab7faf0 (drop metrics port from external Service) #121

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(networking): drop metrics port from external Service#120

fix(networking): drop metrics port from external Service#120
bdchatham merged 1 commit intomainfrom
fix/servicemonitor-per-pod-selector

bdchatham commented Apr 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bdchatham commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Context

Approach

Scope

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bdchatham commented Apr 22, 2026 •

edited

Loading