
OCPVE-379: fix: avoid checking resources for BestEffort pods #28006

Conversation

eggfoobar
Contributor

@eggfoobar eggfoobar commented Jun 27, 2023

The reason the flakes have been occurring is that some tests around the network-operator create DaemonSets and Deployments that do not set resource requests. This change adds a skip list for workloads that are acceptable as BestEffort, and adds a code comment to be explicit about what BestEffort means in a workload-partitioned cluster. This does not affect the CPU affinity for workload partitioning.
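The skip-list approach described above could be sketched roughly as follows. This is a hypothetical illustration, not the code from this PR: the exception entries, function names, and the simplified pod dictionary shape are all assumptions made for the example (the real check lives in the OpenShift test suite and uses different names and types).

```python
# Hypothetical sketch of a BestEffort skip list for resource checks.
# The (namespace, name) entries below are placeholders, not real workloads.
BEST_EFFORT_EXCEPTIONS = {
    ("example-operator-namespace", "example-daemonset"),
}

def is_best_effort(pod):
    """A pod is BestEffort when no container sets any requests or limits."""
    return all(
        not c.get("resources", {}).get("requests")
        and not c.get("resources", {}).get("limits")
        for c in pod["containers"]
    )

def should_check_resources(namespace, name, pod):
    """Skip the resource-annotation check only for allow-listed BestEffort pods."""
    if is_best_effort(pod) and (namespace, name) in BEST_EFFORT_EXCEPTIONS:
        return False
    return True
```

Using an explicit exclude list (rather than skipping every BestEffort pod) keeps the test strict for workloads that are expected to carry resource requests.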

Test-Grid Flakes: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.14-informing#periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-cpu-partitioning&include-filter-by-regex=CPU%20Partitioning

Ref Resources on the network operator:
DaemonSet
Deployment

/assign @deads2k

@eggfoobar eggfoobar changed the title fix: avoid checking resources for BestEffort pods OCPVE-379: fix: avoid checking resources for BestEffort pods Jun 27, 2023
@openshift-ci-robot

openshift-ci-robot commented Jun 27, 2023

@eggfoobar: This pull request references OCPVE-379 which is a valid Jira issue.

In response to this:

The reason the flakes have been occurring is that some tests around the network-operator create DaemonSets and Deployments that do not set resource requests. Since BestEffort is something we don't alter during the admission webhook, I think it makes sense to skip BestEffort pods for resource checks. This does not affect the CPU affinity for workload partitioning; it will just set the default cpushare time.

Test-Grid Flakes: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.14-informing#periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-cpu-partitioning&include-filter-by-regex=CPU%20Partitioning

Ref Resources on the network operator:
DaemonSet
Deployment

/assign @deads2k

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 27, 2023
@openshift-ci openshift-ci bot requested review from csrwng and mfojtik June 27, 2023 17:57
@eggfoobar eggfoobar force-pushed the fix-cpu-partitioning-flakes branch from c062f0f to 6359b46 Compare June 27, 2023 19:10
@deads2k
Contributor

deads2k commented Jun 27, 2023

The point of CPU affinity is to keep the platform from taking CPU from the workloads. Our platform pods, even BestEffort ones, need to be controlled for that purpose.

@eggfoobar
Contributor Author

Oh sorry @deads2k, I wasn't clear about what's happening: the desired CPU affinity is not being violated here. Two things happen for workload partitioning, a CPU affinity change and a CPU share change.

The flake was around the CPU share itself. We checked that pods carrying the workload-partitioning annotations also had their resource requests moved to the custom resource field, so this would flake on tests that spun up BestEffort Deployments and DaemonSets: the default behavior is to apply the workload annotation with the minimum of 2 CPU shares and to not add or change resource requests, since none are present at admission time.

Unless we want to require that all platform pods contain resource requests, this is safe to skip, since the CPU affinity is still respected and the workload gets a default CPU share value of 2. What do you think?
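The admission-time behavior described above can be sketched like this. This is a simplified, hypothetical model, not the webhook's actual code: the annotation key is modeled loosely on OpenShift's workload-partitioning annotations, and the `cpu_millis` request field and function name are assumptions made for the example.

```python
def mutate_for_partitioning(pod):
    """Hypothetical sketch of the admission behavior described above.

    Every container gets a workload-partitioning annotation, so CPU
    affinity is always applied. Containers with a CPU request get shares
    derived from that request; BestEffort containers get the minimum of
    2 shares, and their (absent) resource requests are left untouched.
    """
    annotations = pod.setdefault("annotations", {})
    for c in pod["containers"]:
        # Simplified: the request is assumed to already be in millicores.
        cpu_millis = c.get("resources", {}).get("requests", {}).get("cpu_millis", 0)
        shares = max(2, cpu_millis * 1024 // 1000)
        annotations["resources.workload.openshift.io/" + c["name"]] = {"cpushares": shares}
    return pod
```

The point the comment makes is visible in the sketch: a BestEffort pod still receives the annotation (so affinity applies), just with the floor value of 2 shares.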

@eggfoobar
Contributor Author

eggfoobar commented Jun 29, 2023

Adding some more context around CPU shares. A CPU share is a relative amount of time a process is granted on a CPU; in Kubernetes, 1 CPU = 1024 CPU shares.

When there is no contention for CPU time, nothing happens and a process is allowed to use more. When there is contention, CPU shares are enforced, and each process gets its proportional percentage of CPU time, e.g. two processes with 512 shares each will each get 50% of the CPU time.

In Kubernetes, when a pod has a QoS class of BestEffort, the kubelet automatically gives it the default CPU share of 2, meaning it can be scheduled pretty much anywhere since its share is so small, and under contention it will get the smallest amount of time on the CPU. If you need a guaranteed minimum, consider supplying resources.requests for your pod.
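The request-to-shares conversion described above can be written out as a small function. This sketch is modeled on the kubelet's millicore-to-cpu.shares conversion as I understand it (an assumption about upstream behavior, not code from this PR); the constant and function names are local to the example.

```python
MIN_SHARES = 2          # kubelet minimum, also the BestEffort default
SHARES_PER_CPU = 1024   # 1 CPU == 1024 cgroup cpu.shares
MILLI_PER_CPU = 1000    # millicores per CPU

def milli_cpu_to_shares(milli_cpu):
    """Convert a CPU request in millicores to cgroup cpu.shares,
    clamped to the kubelet's minimum of 2."""
    if milli_cpu <= 0:
        return MIN_SHARES  # no request (BestEffort) -> minimum shares
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // MILLI_PER_CPU)
```

So a 500m request maps to 512 shares, a full CPU to 1024, and no request at all to the floor of 2.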

Normally this is not much of a problem. However, when using the workload pinning feature (which lets cluster admins limit the number of CPUs the platform pods have access to), the probability of CPU time contention goes up and resources.requests become more important for platform pods. It is strongly recommended that everything but your least important pods supply adequate resources.requests; otherwise a pod may be starved for CPU time with the default 2 CPU shares. In other words, on a highly utilized CPU in a CPU-pinned cluster, such a pod would get only 2/1024 * 100 ≈ 0.19% of the CPU time.
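The proportional arithmetic behind that 0.19% figure can be checked with a one-liner. This is a generic illustration of cgroup share math, with names invented for the example, assuming full contention where every competing group is runnable.

```python
def cpu_time_fraction(shares, competing_shares):
    """Under full contention, a cgroup's CPU time is its shares divided
    by the total runnable shares on the CPU."""
    return shares / (shares + sum(competing_shares))

# Two equal 512-share processes split the CPU 50/50.
# A 2-share BestEffort pod next to 1022 shares of other work gets
# 2/1024 of the CPU, i.e. roughly 0.2% of the time.
```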

@eggfoobar eggfoobar force-pushed the fix-cpu-partitioning-flakes branch from 6359b46 to 748ea47 Compare July 5, 2023 14:31
@openshift-ci-robot

openshift-ci-robot commented Jul 5, 2023

@eggfoobar: This pull request references OCPVE-379 which is a valid Jira issue.

In response to this:

The reason the flakes have been occurring is that some tests around the network-operator create DaemonSets and Deployments that do not set resource requests. This change adds a skip list for workloads that are acceptable as BestEffort, and adds a code comment to be explicit about what BestEffort means in a workload-partitioned cluster. This does not affect the CPU affinity for workload partitioning.

Test-Grid Flakes: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.14-informing#periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-cpu-partitioning&include-filter-by-regex=CPU%20Partitioning

Ref Resources on the network operator:
DaemonSet
Deployment

/assign @deads2k

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eggfoobar eggfoobar force-pushed the fix-cpu-partitioning-flakes branch from 748ea47 to c503d91 Compare July 5, 2023 14:43
Signed-off-by: ehila <ehila@redhat.com>

feat: update to use exclude list instead of blanket ignore on qos

Signed-off-by: ehila <ehila@redhat.com>
@eggfoobar eggfoobar force-pushed the fix-cpu-partitioning-flakes branch from c503d91 to 0057332 Compare July 5, 2023 15:08
@deads2k
Contributor

deads2k commented Jul 5, 2023

The networking team ack'd these. The CPU slice is extremely small, and bugs were opened so there's long-term tracking as well.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 5, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jul 5, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, eggfoobar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 5, 2023
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 861d0fd and 2 for PR HEAD 0057332 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 30b9ae2 and 1 for PR HEAD 0057332 in total

@eggfoobar
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Jul 6, 2023

@eggfoobar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-single-node-serial 0057332 link false /test e2e-aws-ovn-single-node-serial
ci/prow/e2e-openstack-ovn 0057332 link false /test e2e-openstack-ovn
ci/prow/e2e-aws-ovn-single-node-upgrade 0057332 link false /test e2e-aws-ovn-single-node-upgrade
ci/prow/e2e-aws-ovn-cgroupsv2 0057332 link false /test e2e-aws-ovn-cgroupsv2
ci/prow/e2e-metal-ipi-ovn-ipv6 0057332 link false /test e2e-metal-ipi-ovn-ipv6

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 21f0791 into openshift:master Jul 6, 2023
17 of 22 checks passed
@eggfoobar eggfoobar deleted the fix-cpu-partitioning-flakes branch August 8, 2023 19:50