OCPBUGS-30569: Systemd processes not being moved to cpuset/systemd.slice fix #992
@rbaturov: This pull request references Jira Issue OCPBUGS-30569, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mniranja@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
/hold
b4c5e0d to a04faf5
/retest-required
pkg/performanceprofile/controller/performanceprofile/components/machineconfig/machineconfig.go (outdated, resolved)
General approach LGTM. One question, but it doesn't necessarily need to be addressed.
/unhold
ee2a78a to 0e6ef59
test/e2e/performanceprofile/functests/1_performance/performance.go (outdated, resolved)
dd90277 to deffc0c
/retest-required
/hold
deffc0c to 1c0a391
@rbaturov: This pull request references Jira Issue OCPBUGS-30569, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mniranja@redhat.com), skipping review request.
/retest-required
The script cpuset-configure.sh is responsible for moving the systemd processes to the cpuset/systemd.slice cgroup and runs as a service (cpuset-configure.service).

In the current implementation, the script runs too early: some system processes have not been created yet, so they are never moved to the custom system slice.

Moreover, the script currently runs before network-online.target. The intention was to run it before the kubelet and crio services are initialized (network-online.target being a common parent), to make sure that no workload pods start before the transition is made.

The proposed fix consists of the following changes:
1. Adding an After statement. The script starts once the crio service is initialized, because crio is initialized at the very end of the boot process, just before kubelet. This ensures that late-starting processes do not fall between the cracks.
2. Narrowing the Before statement down to a more accurate one that reflects its original intention. (Running the script before kubelet alone is enough to guarantee that no workload pods have started by then.)

Signed-off-by: Ronny Baturov <rbaturov@redhat.com>
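To make the ordering change concrete, here is a minimal Go sketch that embeds an illustrative `[Unit]` section and extracts its ordering directives. The unit names `crio.service` and `kubelet.service`, the description text, and the helper `orderingDeps` are assumptions made for illustration; the actual unit file generated by the operator is not shown in this conversation and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// unitSketch is an illustrative [Unit] section. The unit names and the
// description are assumptions, not the file shipped by the operator.
const unitSketch = `[Unit]
Description=Move system processes to the custom cpuset slice (illustrative)
After=crio.service
Before=kubelet.service
`

// orderingDeps collects the values of a systemd ordering directive
// ("After" or "Before") from unit file text. Hypothetical helper.
func orderingDeps(unit, directive string) []string {
	var deps []string
	for _, line := range strings.Split(unit, "\n") {
		key, value, found := strings.Cut(strings.TrimSpace(line), "=")
		if !found || key != directive {
			continue
		}
		// A directive may list several space-separated units.
		deps = append(deps, strings.Fields(value)...)
	}
	return deps
}

func main() {
	fmt.Println("After: ", orderingDeps(unitSketch, "After"))
	fmt.Println("Before:", orderingDeps(unitSketch, "Before"))
}
```

With this ordering, the script would run in the window between crio becoming active and kubelet starting, which is the window the commit message describes.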
With cgroups v1, we rely on cpuset-configure.service to move all the system services to the custom system.slice. This test verifies that the service actually moved them. It is also good practice to check for similar errors on cgroup v2 systems. Signed-off-by: Ronny Baturov <rbaturov@redhat.com>
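The kind of check such a test performs can be sketched as follows, assuming the cgroup v1 `/proc/<pid>/cgroup` line format `hierarchy-ID:controller-list:path`. The helpers `cpusetPath` and `inSlice`, the sample input, and the slice name passed in are hypothetical; this is not the test added in performance.go.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cpusetPath returns the cgroup path of the cpuset controller from the
// contents of a /proc/<pid>/cgroup file (cgroup v1 format). The boolean
// is false when no cpuset entry is present.
func cpusetPath(procCgroup string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(procCgroup))
	for scanner.Scan() {
		fields := strings.SplitN(scanner.Text(), ":", 3)
		if len(fields) != 3 {
			continue
		}
		// The second field may list several co-mounted controllers.
		for _, controller := range strings.Split(fields[1], ",") {
			if controller == "cpuset" {
				return fields[2], true
			}
		}
	}
	return "", false
}

// inSlice reports whether a cgroup path is the given slice or below it.
func inSlice(path, slice string) bool {
	prefix := "/" + slice
	return path == prefix || strings.HasPrefix(path, prefix+"/")
}

func main() {
	// Sample content; on a node this would be read from /proc/<pid>/cgroup.
	sample := "12:cpuset:/system.slice\n11:memory:/system.slice/crio.service\n"
	path, ok := cpusetPath(sample)
	fmt.Println(path, ok, inSlice(path, "system.slice"))
}
```

A real test would iterate over the PIDs of the system processes on the node and fail for any process whose cpuset path falls outside the expected slice.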
1c0a391 to 809bebf
/retest-required
/unhold
/lgtm @MarSik PTAL for another review
@yanirq: GitHub didn't allow me to request PR reviews from the following users: PTAL, for, another, review. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.
/approve the test seems ok - as simple as can be, and the dependency logic seems to be the sweet spot
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ffromani, rbaturov
@rbaturov: all tests passed!
@rbaturov: Jira Issue OCPBUGS-30569: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-30569 has been moved to the MODIFIED state.
/cherry-pick release-4.15
@rbaturov: #992 failed to apply on top of branch "release-4.15":
…/systemd.slice fix
This is a manual backport for openshift#992
[ART PR BUILD NOTIFIER] This PR has been included in build cluster-node-tuning-operator-container-v4.16.0-202404031748.p0.g89b3e39.assembly.stream.el9 for distgit cluster-node-tuning-operator.