Skip to content

OCPNODE-4487: replace worker --system-reserved flags with config drop-in#6044

Open
nispriha wants to merge 1 commit into
openshift:mainfrom
nispriha:njagan/auto-sizing-dropin
Open

OCPNODE-4487: replace worker --system-reserved flags with config drop-in#6044
nispriha wants to merge 1 commit into
openshift:mainfrom
nispriha:njagan/auto-sizing-dropin

Conversation

@nispriha
Copy link
Copy Markdown

@nispriha nispriha commented May 14, 2026

- What I did

Remove EnvironmentFile=/etc/node-sizing.env and the --system-reserved command-line flag from worker kubelet.service. The auto-sizing script (dynamic-system-reserved-calc.sh) now writes a KubeletConfiguration drop-in file to /etc/openshift/kubelet.conf.d/20-auto-sizing.conf, which kubelet reads via its existing --config-dir flag. This eliminates the round trip through the environment file for worker nodes. Master and arbiter nodes remain unchanged.

- How to verify it

  • go test ./pkg/controller/kubelet-config/... -v passes
  • e2e test TestKubeletAutoSizingDropIn verifies worker kubelet.service does not contain --system-reserved or EnvironmentFile=/etc/node-sizing.env
  • On a running cluster, verify /etc/openshift/kubelet.conf.d/20-auto-sizing.conf exists on worker nodes after boot

- Description for the changelog

Worker kubelet now receives systemReserved values via a KubeletConfiguration drop-in instead of --system-reserved command-line flags.

Summary by CodeRabbit

  • Refactor

    • Kubelet auto-sizing configuration now uses drop-in files instead of command-line flags for system reserved resources.
  • Tests

    • Added end-to-end test to validate kubelet auto-sizing configuration behavior.

Remove EnvironmentFile=/etc/node-sizing.env and the --system-reserved
command-line flag from worker kubelet.service. The auto-sizing script
now writes a KubeletConfiguration drop-in file to
/etc/openshift/kubelet.conf.d/20-auto-sizing.conf, which kubelet reads
via its existing --config-dir flag.

This eliminates the round trip through the environment file for worker
nodes. Master and arbiter nodes remain unchanged.
@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 14, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented May 14, 2026

@nispriha: This pull request references OCPNODE-4487 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

- What I did

Remove EnvironmentFile=/etc/node-sizing.env and the --system-reserved command-line flag from worker kubelet.service. The auto-sizing script (dynamic-system-reserved-calc.sh) now writes a KubeletConfiguration drop-in file to /etc/openshift/kubelet.conf.d/20-auto-sizing.conf, which kubelet reads via its existing --config-dir flag. This eliminates the round trip through the environment file for worker nodes. Master and arbiter nodes remain unchanged.

- How to verify it

  • go test ./pkg/controller/kubelet-config/... -v passes
  • e2e test TestKubeletAutoSizingDropIn verifies worker kubelet.service does not contain --system-reserved or EnvironmentFile=/etc/node-sizing.env
  • On a running cluster, verify /etc/openshift/kubelet.conf.d/20-auto-sizing.conf exists on worker nodes after boot

- Description for the changelog

Worker kubelet now receives systemReserved values via a KubeletConfiguration drop-in instead of --system-reserved command-line flags.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 14, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nispriha
Once this PR has been reviewed and has the lgtm label, please assign pablintino for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bfce4348-e471-46e9-8575-5d0e853bdac1

📥 Commits

Reviewing files that changed from the base of the PR and between 76610ef and 8c2d700.

📒 Files selected for processing (4)
  • templates/common/_base/files/kubelet-auto-sizing.yaml
  • templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • test/e2e-2of2/kubeletcfg_test.go
💤 Files with no reviewable changes (2)
  • templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml

Walkthrough

The PR refactors how kubelet system-reserved resources are configured. Instead of passing them as command-line arguments to the kubelet service, the configuration is now written to a drop-in file (/etc/openshift/kubelet.conf.d/20-auto-sizing.conf) by a new bash helper. Systemd units no longer reference the /etc/node-sizing.env file or include the --system-reserved flag.

Changes

Kubelet Auto-Sizing Configuration Migration

Layer / File(s) Summary
Auto-sizing script helper and integration
templates/common/_base/files/kubelet-auto-sizing.yaml
New write_kubelet_config_dropin helper function writes /etc/openshift/kubelet.conf.d/20-auto-sizing.conf with computed system reserved resources. Both dynamic_node_sizing and static_node_sizing paths now invoke this helper after calculating memory, CPU, and ephemeral-storage values. Variables are explicitly set before appending to environment.
Kubelet service units
templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml, templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
Both base and on-prem kubelet systemd unit templates remove the --system-reserved=cpu=...,memory=...,ephemeral-storage=... ExecStart argument. EnvironmentFile loading changes from /etc/node-sizing.env to kubernetes-specific kubelet env files (/etc/kubernetes/kubelet-workaround, /etc/kubernetes/kubelet-env).
E2E test validation
test/e2e-2of2/kubeletcfg_test.go
New TestKubeletAutoSizingDropIn test selects a random worker node and verifies the kubelet.service unit neither contains --system-reserved flags nor references EnvironmentFile=/etc/node-sizing.env.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Microshift Test Compatibility ⚠️ Warning New test TestKubeletAutoSizingDropIn uses ExecCmdOnNode which depends on Machine Config Daemon in openshift-machine-config-operator namespace, not available on MicroShift. Add [Skipped:MicroShift] label to test name or guard with IsMicroShiftCluster() check with skip. Refactor to avoid machine-config-operator namespace dependency.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically summarizes the main change: replacing worker --system-reserved flags with a config drop-in, directly matching the primary objective of the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed TestKubeletAutoSizingDropIn has a static, deterministic name with no dynamic values. Test file uses standard Go testing, not Ginkgo. Follows naming best practices.
Test Structure And Quality ✅ Passed The custom check is for Ginkgo tests (Describe/It blocks, BeforeEach/AfterEach). The added test uses standard Go testing (func TestXXX), not Ginkgo. This check is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The test validates kubelet.service config on a single worker node. No multi-node assumptions. Compatible with SNO since SNO nodes have worker role labels.
Topology-Aware Scheduling Compatibility ✅ Passed No topology-incompatible scheduling constraints. Changes limited to kubelet config delivery (systemd templates, bash scripts). No pod affinity, replica counts, or topology-dependent rules added.
Ote Binary Stdout Contract ✅ Passed PR contains no process-level stdout writes. The new test function uses only test utilities and assertions. Templates are not subject to OTE Binary Stdout Contract. No violations detected.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed Test has no IPv4 hardcoded addresses, IPv4-specific parsing, or external connectivity. Templates properly support IPv6 with conditional logic. IPv6/disconnected-network compatible.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nispriha
Copy link
Copy Markdown
Author

/retest

1 similar comment
@nispriha
Copy link
Copy Markdown
Author

/retest

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 15, 2026

@nispriha: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-openstack 8c2d700 link false /test e2e-openstack

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@haircommander
Copy link
Copy Markdown
Member

is the control plane unchanged just because we're anxious about breaking something? I feel it'd be better to set consistently TBH

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants