chore: Add leader election for cluster metrics #42

Merged
jsingleton-dev merged 1 commit into main from jsingleton/add-otel-leader-election
May 11, 2026

@jsingleton-dev jsingleton-dev commented May 11, 2026

The OTel collector runs as a DaemonSet with one pod per node. If we enabled cluster metrics as is, every pod would report the same cluster-level metrics and we would end up with duplicate numbers. To avoid this, we can use the leader election extension for OTel, which elects a single leader from the DaemonSet to report cluster metrics.

An alternative would be a separate single-replica Deployment just for cluster metrics, but that would be a larger change, and leader election fits nicely into our existing charts.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added OpenTelemetry leader election support for Kubernetes cluster receivers.
  • Chores

    • Incremented chart versions to reflect configuration updates.

coderabbitai Bot commented May 11, 2026

📝 Walkthrough

This PR adds Kubernetes leader election support to OpenTelemetry's Kubernetes cluster receiver. When the kubernetesCluster preset is enabled, the chart conditionally injects a k8s_leader_elector extension with serviceAccount auth and configures the receiver to use it. RBAC permissions for lease resources are granted, and comprehensive tests validate the feature across default and preset-enabled states.
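As a rough sketch, the rendered collector configuration with the preset enabled might look something like the following. It assumes the upstream extension's `auth_type`, `lease_name`, and `lease_namespace` fields. The lease name and namespace shown are placeholders rather than the chart's actual output, and pipelines, other receivers, and the `health_check` settings are omitted.

```yaml
extensions:
  # Existing extension; its settings are omitted here.
  health_check: {}
  k8s_leader_elector:
    auth_type: serviceAccount
    # Placeholder values: per the walkthrough, the chart derives these
    # from its fullname when the kubernetesCluster preset is enabled.
    lease_name: example-otel-leader
    lease_namespace: example-namespace

receivers:
  k8s_cluster:
    # Points the receiver at the extension by component ID, so only the
    # pod that currently holds the lease collects cluster metrics.
    k8s_leader_elector: k8s_leader_elector

service:
  extensions:
    - health_check
    - k8s_leader_elector
```

The extension has to be both defined under `extensions` and listed under `service.extensions`. The receiver then refers to it by name.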

Changes

OTEL Leader Election Support

| Layer | File(s) | Summary |
| --- | --- | --- |
| Leader Election Extension Configuration | charts/ctrlplane/charts/otel/templates/_config.tpl | Adds conditional k8s_leader_elector extension under otel.extensions with serviceAccount auth, lease name, and namespace derived from chart fullname when kubernetesCluster preset is enabled. |
| Service Extensions Wiring | charts/ctrlplane/charts/otel/templates/_config.tpl | Registers k8s_leader_elector in otel.service.extensions list alongside health_check when the preset is active. |
| Kubernetes Cluster Receiver Configuration | charts/ctrlplane/charts/otel/templates/_receivers.tpl | Adds k8s_leader_elector: k8s_leader_elector field to the k8s_cluster receiver to wire the extension when preset is enabled. |
| RBAC Lease Permissions | charts/ctrlplane/charts/otel/templates/clusterrole.yaml | Extends ClusterRole with new rules granting get, list, watch, create, update, patch, delete verbs on coordination.k8s.io leases for leader election. |
| Chart Version Updates | charts/ctrlplane/Chart.yaml, charts/ctrlplane/charts/otel/Chart.yaml | Bumps ctrlplane chart version from 1.1.3 to 1.1.4; bumps otel subchart version from 0.2.2 to 0.2.3. |
| Leader Election Test Suite | charts/ctrlplane/tests/otel_leader_election_test.yaml | New test suite validates ClusterRole lease permissions, confirms leader elector absent by default, asserts preset-enabled config includes extension with correct auth/lease settings, verifies receiver references the elector, and confirms service extensions registration. |
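For illustration, the lease rule described in the RBAC row could be written roughly as below. This is a sketch built from the verbs and API group listed in the summary, not a copy of the actual clusterrole.yaml; the ClusterRole's metadata and existing rules are omitted.

```yaml
# Additional rule on the collector's ClusterRole (other rules omitted).
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
```

Leader election works by acquiring and periodically renewing a Lease object, so the collector's service account needs at least read, create, and update access to leases in the lease namespace.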

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • zacharyblasczyk
  • jsbroks
  • adityachoudhari26

Poem

🐰 A leader hops forth in consensus divine,
Where Kubernetes leases in harmony align,
With tests to ensure the election is tight,
The chart takes its place in the cluster's sweet light!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'chore: Add leader election for cluster metrics' directly and clearly describes the main change: adding leader election functionality for cluster metrics reporting to prevent duplicates in the OTel DaemonSet configuration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
charts/ctrlplane/charts/otel/templates/_config.tpl (1)

62-64: 💤 Low value

Consider a more flexible regex pattern.

The regex on line 64 assumes a specific order and formatting for the extensions list. While this works for the current implementation, it could break if whitespace or ordering changes in the future.

More flexible alternative

You could use separate assertions to be more resilient:

- matchRegex:
    path: data.config
    pattern: 'service:[\s\S]*extensions:[\s\S]*- k8s_leader_elector'

This validates that k8s_leader_elector appears in the service extensions list without assuming specific ordering or adjacent items. However, the current approach is perfectly acceptable for Helm chart testing where template output is predictable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/ctrlplane/charts/otel/templates/_config.tpl` around lines 62 - 64, The
current matchRegex pattern in the matchRegex block for path: data.config is too
strict about ordering/whitespace; update the pattern used by matchRegex (the
pattern property) to a more flexible regex that uses broader dotall-style
matching (e.g., using [\s\S]* between sections) so it asserts "service:" appears
before "extensions:" and that "k8s_leader_elector" occurs somewhere in the
extensions list without requiring exact adjacency or whitespace; locate the
matchRegex/pattern entry in the template and replace the rigid pattern with the
relaxed one while keeping path: data.config and the matchRegex key intact.
charts/ctrlplane/tests/otel_leader_election_test.yaml (1)

62-64: ⚡ Quick win

Make service.extensions assertion order-agnostic to reduce flaky tests.

This regex hard-codes extension order and formatting, so harmless reordering can break the test. Prefer asserting presence of both entries independently (or parse/assert list membership) instead of sequence.

Proposed test adjustment
-      - matchRegex:
-          path: data.config
-          pattern: 'extensions:\s*\n\s+- health_check\s*\n\s+- k8s_leader_elector'
+      - matchRegex:
+          path: data.config
+          pattern: 'extensions:[\s\S]*- health_check'
+      - matchRegex:
+          path: data.config
+          pattern: 'extensions:[\s\S]*- k8s_leader_elector'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/ctrlplane/tests/otel_leader_election_test.yaml` around lines 62 - 64,
The test currently uses matchRegex on path "data.config" with a hard-coded
ordered pattern ('extensions: ... - health_check ... - k8s_leader_elector')
which is brittle; update the assertion for "service.extensions" to be
order-agnostic by replacing the single ordered regex with either two independent
presence checks (e.g., separate matchRegex/assertions for "- health_check" and
"- k8s_leader_elector" against data.config) or a single regex that uses
lookahead assertions to require both "- health_check" and "- k8s_leader_elector"
under the "extensions:" block so the test no longer depends on extension order.
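A minimal helm-unittest sketch of the two-assertion variant suggested above might look like this. The template path and the values key used to enable the preset are hypothetical stand-ins, since the real suite's setup is not shown in this conversation; only the `matchRegex` assertions mirror the proposed adjustment.

```yaml
suite: otel leader election (illustrative)
templates:
  # Hypothetical path: use whichever template renders data.config.
  - charts/otel/templates/configmap.yaml
tests:
  - it: registers both extensions regardless of order
    set:
      # Hypothetical values key for enabling the kubernetesCluster preset.
      otel.presets.kubernetesCluster.enabled: true
    asserts:
      # Each extension is checked on its own, so reordering the list
      # does not break the test.
      - matchRegex:
          path: data.config
          pattern: 'extensions:[\s\S]*- health_check'
      - matchRegex:
          path: data.config
          pattern: 'extensions:[\s\S]*- k8s_leader_elector'
```

Go's `regexp` package (RE2 syntax) does not support lookahead, and helm-unittest is written in Go, so the single-regex lookahead variant may not be usable here. The two independent assertions are likely the safer choice.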

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f643a42a-0727-44d4-80c8-22359a84db95

📥 Commits

Reviewing files that changed from the base of the PR and between 8b19638 and 2dcc8a7.

⛔ Files ignored due to path filters (1)
  • charts/ctrlplane/Chart.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • charts/ctrlplane/Chart.yaml
  • charts/ctrlplane/charts/otel/Chart.yaml
  • charts/ctrlplane/charts/otel/templates/_config.tpl
  • charts/ctrlplane/charts/otel/templates/_receivers.tpl
  • charts/ctrlplane/charts/otel/templates/clusterrole.yaml
  • charts/ctrlplane/tests/otel_leader_election_test.yaml

@jsingleton-dev jsingleton-dev merged commit 11663e8 into main May 11, 2026
4 checks passed
@jsingleton-dev jsingleton-dev deleted the jsingleton/add-otel-leader-election branch May 11, 2026 18:50