
[Prometheus]: Remove ComponentExceedsRequestedCPU alert #6977

Conversation

iholder101
Contributor

What this PR does / why we need it:
This PR removes the KubeVirtComponentExceedsRequestedCPU alert.

The runbook for this alert says: "If this alert consistently fires this could mean that the node's CPU resources are not being optimally used and could be overloaded." This statement is a mistake.

In Kubernetes, it's completely normal for a container to use more CPU than is defined in its requests section. Not only that: in order to avoid wasted CPU time, the CPU "leftovers" are distributed among containers in proportion to their CPU requests.

Example: imagine two containers running on a node. Container A requests 0.5 CPUs, container B requests 0.3 CPUs, and the node has 1 CPU (to keep the example simple). In this situation it isn't wise to waste the remaining 20% of CPU time that is left unused. Instead, each container gets a share of that remaining 20% in proportion to its CPU request.
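
To make the arithmetic concrete, here is a minimal sketch of that setup (the pod and image names are invented; only the CPU requests matter):

```yaml
# Two containers with CPU requests but no CPU limits, as in the example above.
# If both are CPU-bound on a 1-CPU node, the CFS scheduler splits CPU time in
# proportion to their requests (requests map to cgroup CPU shares):
#   container-a: 1 * 0.5/0.8 = 0.625 CPUs
#   container-b: 1 * 0.3/0.8 = 0.375 CPUs
# i.e. the spare 0.2 CPUs are shared 5:3, following the requests.
apiVersion: v1
kind: Pod
metadata:
  name: proportional-burst-example
spec:
  containers:
    - name: container-a
      image: registry.example/app-a
      resources:
        requests:
          cpu: 500m   # 0.5 CPUs -> cpu.shares = 512 on cgroup v1
    - name: container-b
      image: registry.example/app-b
      resources:
        requests:
          cpu: 300m   # 0.3 CPUs -> cpu.shares = 307 on cgroup v1
```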

Therefore, an alert should not be triggered in such cases, which are absolutely normal and expected.

Release note:

NONE

@kubevirt-bot added the release-note-none, dco-signoff: yes, size/M, and area/monitoring labels on Dec 20, 2021
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign rmohr after the PR has been reviewed.
You can assign the PR to them by writing /assign @rmohr in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@iholder101
Contributor Author

/cc @sradco

@iholder101
Contributor Author

/retest
/cc @Barakmor1

@Barakmor1
Member

/lgtm

@kubevirt-bot added the lgtm label on Dec 21, 2021
@xpivarc
Member

xpivarc commented Dec 21, 2021

I may not be aware of the motivation behind this, but I don't think removing this alert is the right way to go.

The part cited in the PR description is the impact. I agree that we can improve here, at least by mentioning the more important issue. The issue I see is the possible suboptimal performance of our components. Furthermore, we should revisit the evaluation time period to filter out sudden spikes.

I think keeping this alert is important, at least for larger-scale clusters. Maybe this is something for sig-scale to look into.

@Barakmor1
Member

I may not be aware of the motivation behind this, but I don't think removing this alert is the right way to go.

The part cited in the PR description is the impact. I agree that we can improve here, at least by mentioning the more important issue. The issue I see is the possible suboptimal performance of our components. Furthermore, we should revisit the evaluation time period to filter out sudden spikes.

I think keeping this alert is important, at least for larger-scale clusters. Maybe this is something for sig-scale to look into.

I think that CPU usage is determined by the system. For instance,
if one container (let's call it container A) requests 50% of the node's CPU, there are two cases to consider:

  • There are other containers that fill the remaining CPU gap (the total CPU requests of all the containers add up to 100% of the node). In that case container A will use 50% of the CPU, and the scheduler will make sure of that (just like processes in an OS).

  • There is free CPU on the node. For instance, if container A runs alone on the node, it will use 100% of the CPU, since it would be a waste of resources to leave the node idle (it doesn't make sense to cap it at 50% of the CPU if no other container will ever use the rest).

Therefore I don't think that a container exceeding its CPU request is something that should trigger an alert.

@iholder101
Contributor Author

I agree with @Barakmor1.
The simplest example is a node with 2 CPUs running one container that requests 1 CPU. Should the second CPU sit idle? That doesn't make sense, so Kubernetes will let the container use both CPUs as long as no other container asks for them. In such a scenario this alert would fire constantly even though everything is completely normal and operating as expected.
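
A minimal sketch of that scenario (the pod name and image are invented):

```yaml
# Hypothetical pod requesting 1 CPU with no CPU limit. On an otherwise idle
# 2-CPU node it may legitimately use both CPUs, because nothing else is
# competing for them; an alert on "usage > request" would then fire constantly
# even though this is exactly the intended behavior.
apiVersion: v1
kind: Pod
metadata:
  name: burst-above-request
spec:
  containers:
    - name: app
      image: registry.example/app
      resources:
        requests:
          cpu: "1"   # guarantees 1 CPU under contention; does not cap usage
```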

@xpivarc Please let me know if you agree. I'd be happy to provide further information.

@iholder101
Contributor Author

After speaking with @xpivarc offline, we both agree that the current state of the alert is completely wrong and needs to be changed. I'm not sure what the intent behind this alert was to begin with, but one possibility is that it aimed to indicate that a CPU "request" amount is set incorrectly. While this alert does not serve that goal in any way, we can perhaps think of other ways to serve it.

(We thought about triggering an alert when two things happen at the same time: a VMI that used to exceed its request stops exceeding it, and performance degrades. This might indicate that the container actually needs more CPU than it requests.)

While I agree that this is an important goal, I don't think we should keep a bad alert simply because we don't have a good one yet. IMHO this alert only makes things worse, not better. It is confusing and fires in perfectly normal situations, the runbook for it is completely wrong, and I can't think of a single situation in which this alert would benefit anyone.

@iholder101
Contributor Author

/cc @vladikr FYI

@kubevirt-bot
Contributor

@iholder-redhat: GitHub didn't allow me to request PR reviews from the following users: FYI.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @vladikr FYI

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot added the needs-rebase label on Dec 22, 2021
@iholder101 force-pushed the prometheus/removeAlertForCpuOverRequested branch from 3ce3628 to d75f123 on December 22, 2021 14:38
@kubevirt-bot removed the lgtm and needs-rebase labels on Dec 22, 2021
@iholder101
Contributor Author

Rebased.
Can you re-lgtm? @Barakmor1

@Barakmor1
Member

/lgtm

@kubevirt-bot added the lgtm label on Dec 22, 2021
@vladikr
Member

vladikr commented Dec 22, 2021

/hold
@iholder-redhat this alert doesn't target VMIs; it is about monitoring the CPU consumption of the KubeVirt components (virt-api, virt-handler, virt-controller, etc.).
I think the intent here is to collect information and be able to adjust our defaults.
I think this is useful to have.

@kubevirt-bot added the do-not-merge/hold label on Dec 22, 2021
@iholder101
Contributor Author

/hold
@iholder-redhat this alert doesn't target VMIs; it is about monitoring the CPU consumption of the KubeVirt components (virt-api, virt-handler, virt-controller, etc.). I think the intent here is to collect information and be able to adjust our defaults. I think this is useful to have.

@vladikr
Can you give me an example of a situation in which this might be helpful?

Again: suppose there's a node without any VMIs, running only virt-controller, virt-api, etc. Also, let the node have a lot of extra CPU (in other words, all of the containers on the node request 50% of the node's total CPU). In this situation all of our components will exceed their requested CPU simply because there is free CPU that the node is willing to spare.

@enp0s3
Contributor

enp0s3 commented Jan 2, 2022

@iholder-redhat Hi. I think we already saw one useful case when we tried to debug creating multiple VMs on a large-scale cluster. In that case the virt-api deployment was highly utilized; as you said, it was given only 5m in its resource requests. If that alert had popped up, we could have spotted the high utilization of the virt-api deployment and increased the number of replicas.

@kubevirt-bot added the needs-rebase label on Jan 2, 2022
@iholder101
Contributor Author

@iholder-redhat Hi. I think we already saw one useful case when we tried to debug creating multiple VMs on a large-scale cluster. In that case the virt-api deployment was highly utilized; as you said, it was given only 5m in its resource requests. If that alert had popped up, we could have spotted the high utilization of the virt-api deployment and increased the number of replicas.

As I've explained in detail in previous comments, there is absolutely no correlation between the pod's workload and the amount of CPU it uses.

The only situation in which a Pod will exceed its CPU request is when the node has spare CPU that would otherwise go unused. Therefore this gives us no indication of a problem and will only confuse the cluster admin, doing more harm than good.

I think we already saw one useful case when we tried to debug creating multiple VMs on a large-scale cluster.

This is confusing, but it's not true. Again, we would have seen this alert whenever the node had enough spare CPU; it doesn't have anything to do with the workload itself.
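
For reference, the comparison such an alert is built on looks roughly like this (illustrative only, not the actual KubeVirt rule; the metric names are the standard cAdvisor / kube-state-metrics ones, and the namespace and container pattern are assumptions):

```yaml
# Sketch of a "CPU usage exceeds CPU request" condition. Whether it is true
# depends mainly on how much spare CPU the node currently has to hand out,
# not on the health or load of the component itself.
expr: |
  sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="kubevirt", container=~"virt-.*"}[5m]))
    >
  sum by (pod) (kube_pod_container_resource_requests{namespace="kubevirt", resource="cpu", container=~"virt-.*"})
```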

Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
@iholder101 force-pushed the prometheus/removeAlertForCpuOverRequested branch from d75f123 to ef1a04d on January 3, 2022 09:19
@kubevirt-bot
Contributor

New changes are detected. LGTM label has been removed.

@kubevirt-bot removed the lgtm and needs-rebase labels on Jan 3, 2022
@iholder101
Contributor Author

/retest

@kubevirt-bot
Contributor

@iholder-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubevirt-e2e-k8s-1.20-sig-compute-nonroot ef1a04d link true /test pull-kubevirt-e2e-k8s-1.20-sig-compute-nonroot
pull-kubevirt-e2e-k8s-1.21-sig-compute ef1a04d link true /test pull-kubevirt-e2e-k8s-1.21-sig-compute

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@vladikr
Member

vladikr commented Jan 5, 2022

Can you give me an example of a situation in which this might be helpful?

I think this was needed for #6144
For example.

@vladikr
Member

vladikr commented Jan 5, 2022

Again: suppose there's a node without any VMIs, running only virt-controller, virt-api, etc. Also, let the node have a lot of extra CPU (in other words, all of the containers on the node request 50% of the node's total CPU). In this situation all of our components will exceed their requested CPU simply because there is free CPU that the node is willing to spare.

If all components request 50% of the total capacity, then yes, some components will be throttled. The idea is to identify how much CPU time each of the components needs to avoid throttling.
And this is the point of this alert, AFAIU.
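
For example, a query along these lines could show how much CPU each component actually uses over time, to inform the defaults (a sketch only; the metric name is the standard cAdvisor one, and the namespace / container pattern are assumptions):

```yaml
# 95th percentile of each virt component's 5-minute CPU usage rate over a day.
# Comparing this to the current default request is one way to tune the defaults
# without alerting on every normal burst above the request.
expr: |
  quantile_over_time(0.95,
    (sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="kubevirt", container=~"virt-.*"}[5m])))[1d:5m]
  )
```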

@xpivarc
Member

xpivarc commented Jan 5, 2022

After speaking with @xpivarc offline, we both agree that the current state of the alert is completely wrong and needs to be changed. I'm not sure what the intent behind this alert was to begin with, but one possibility is that it aimed to indicate that a CPU "request" amount is set incorrectly. While this alert does not serve that goal in any way, we can perhaps think of other ways to serve it.

(We thought about triggering an alert when two things happen at the same time: a VMI that used to exceed its request stops exceeding it, and performance degrades. This might indicate that the container actually needs more CPU than it requests.)

While I agree that this is an important goal, I don't think we should keep a bad alert simply because we don't have a good one yet. IMHO this alert only makes things worse, not better. It is confusing and fires in perfectly normal situations, the runbook for it is completely wrong, and I can't think of a single situation in which this alert would benefit anyone.

I only remember agreeing that the runbook is wrong. E.g. "Check to see what the cpu resource limit is." should talk about requests.
The impact is also not the most important part; rather, we should mention the performance degradation of components.

I am definitely against removing the alert, at least not without a replacement.

Again: suppose there's a node without any VMIs, running only virt-controller, virt-api, etc. Also, let the node have a lot of extra CPU (in other words, all of the containers on the node request 50% of the node's total CPU). In this situation all of our components will exceed their requested CPU simply because there is free CPU that the node is willing to spare.

Let's say our component needs 1.5 CPUs in a particular cluster and we request only 1. In this case, we are not guaranteed to get the additional 0.5 CPU for our component, and performance can degrade.

It is completely normal for a Pod to use more CPU than requested, but in this case we don't want that to happen to our components for long (mostly indefinite) periods of time. Therefore I suggested raising the evaluation time.
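
A rough sketch of what I mean (the expression below is only a placeholder built from the standard cAdvisor / kube-state-metrics series, not the rule as currently defined in KubeVirt; the point is the longer for: duration):

```yaml
# Sketch only: the condition must hold continuously for a long period before
# the alert fires, which filters out the normal short bursts above the request.
# Namespace and container pattern are assumptions.
- alert: KubeVirtComponentExceedsRequestedCPU
  expr: |
    sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="kubevirt", container=~"virt-.*"}[5m]))
      >
    sum by (pod) (kube_pod_container_resource_requests{namespace="kubevirt", resource="cpu", container=~"virt-.*"})
  for: 30m
  labels:
    severity: warning
  annotations:
    summary: A KubeVirt component has been using more CPU than it requests for a sustained period.
```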

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot added the lifecycle/stale label on Apr 5, 2022
@iholder101
Contributor Author

/close

@kubevirt-bot
Contributor

@iholder-redhat: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@iholder101
Contributor Author

Evidence for why this alert is misleading: #6144.

It's obvious that this alert fires when the host is under pressure, and that is completely normal, but it makes it look as if something is wrong. I still think it either has to be removed, or renamed and explained properly.
