Do not consider unschedulable nodes when finding lowest TSC frequency #10181

Closed
iholder101 opened this issue Jul 27, 2023 · 0 comments · Fixed by #10182

Context:
TSC frequencies are needed by VMs in several use-cases. In some scenarios these frequencies are required to schedule the VM, and in others they're required in order to migrate the VM.

Since nodes may support different TSC frequencies, the lowest frequency in the cluster is found and used. That way, a VM can schedule on / migrate to one of the following:

  1. Nodes that support this frequency directly
  2. Nodes that support higher frequencies and also support tsc-scaling, which means they can also run at lower frequencies.
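
To illustrate the two cases above, here is a minimal sketch of the eligibility rule. The `node` type, its fields, and `canHost` are hypothetical names made up for this example; they are not KubeVirt's actual types or API.

```go
package main

import "fmt"

// node is a hypothetical, simplified view of a node's TSC capabilities.
// It is not a KubeVirt or Kubernetes type.
type node struct {
	name        string
	tscFreqHz   int64 // the node's native TSC frequency
	tscScalable bool  // whether the node supports tsc-scaling
}

// canHost reports whether a VM that requires requiredHz could run on n:
// either the node supports the frequency directly (case 1), or it has a
// higher frequency and supports tsc-scaling (case 2).
func canHost(n node, requiredHz int64) bool {
	if n.tscFreqHz == requiredHz {
		return true
	}
	return n.tscScalable && n.tscFreqHz > requiredHz
}

func main() {
	const requiredHz = 2_000_000_000
	nodes := []node{
		{name: "exact", tscFreqHz: 2_000_000_000},
		{name: "higher-scalable", tscFreqHz: 3_000_000_000, tscScalable: true},
		{name: "higher-fixed", tscFreqHz: 3_000_000_000},
	}
	for _, n := range nodes {
		fmt.Printf("%s eligible: %v\n", n.name, canHost(n, requiredHz))
	}
}
```

Under this rule, picking the lowest frequency in the cluster maximizes the number of eligible nodes, provided at least one node that supports it is actually usable for VMs.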

What happened:
Consider the following scenario:

  1. A cluster is set up so that master nodes are schedulable.
  2. The admin decides to make them unschedulable. In this case, virt-handler pods would still run on these nodes.
  3. Since virt-handlers are still running on these nodes, they are considered schedulable from KubeVirt's perspective.

In the above scenario, the lowest TSC frequency can be found on an unschedulable node that still runs virt-handler. It's possible that no other nodes support this TSC frequency. In such a scenario, VMs that depend on TSC frequency wouldn't be able to schedule/migrate.

What you expected to happen:
The chosen lowest TSC frequency should be supported on at least one node that VMs can schedule/migrate to.
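
A rough sketch of the expected behavior, assuming each node carries a plain "unschedulable" flag. How schedulability is actually determined is not specified in this issue, so the `node` type, its fields, and `lowestSchedulableTSC` are hypothetical and only illustrate the idea of skipping unschedulable nodes:

```go
package main

import "fmt"

// node is a hypothetical, simplified node representation for this sketch.
type node struct {
	name          string
	tscFreqHz     int64
	unschedulable bool // assumption: true when VMs cannot be placed on the node
}

// lowestSchedulableTSC returns the lowest TSC frequency found on nodes that
// VMs can actually be scheduled on. ok is false when no such node exists.
func lowestSchedulableTSC(nodes []node) (freq int64, ok bool) {
	for _, n := range nodes {
		if n.unschedulable {
			continue // ignore nodes that VMs can never land on
		}
		if !ok || n.tscFreqHz < freq {
			freq, ok = n.tscFreqHz, true
		}
	}
	return freq, ok
}

func main() {
	nodes := []node{
		{name: "master", tscFreqHz: 1_000_000_000, unschedulable: true},
		{name: "worker-1", tscFreqHz: 2_400_000_000},
		{name: "worker-2", tscFreqHz: 3_000_000_000},
	}
	if freq, ok := lowestSchedulableTSC(nodes); ok {
		fmt.Println("lowest schedulable TSC frequency (Hz):", freq)
	} else {
		fmt.Println("no schedulable node reports a TSC frequency")
	}
}
```

With this filter, the master node's very low frequency no longer wins, so the chosen frequency is one that at least one schedulable node supports.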

How to reproduce it (as minimally and precisely as possible):

  1. Create a cluster with unschedulable master nodes.
  2. On these nodes, disable node-labeller. Then, override TSC labels to mock supporting a very low frequency (it needs to be the lowest frequency supported in the cluster).
  3. Make master nodes schedulable.
  4. Create a VM with the invtsc CPU feature.
  5. See that the VM cannot be scheduled.

More Info:
For more info, please look at #10169 and specifically #10169 (comment).

Environment:

  • KubeVirt version (use virtctl version): Client Version: version.Info{GitVersion:"v0.37.0-rc.0.8689+1c845a9d5d793b-dirty", GitCommit:"1c845a9d5d793bd363c27e16c0e7b329cb85e30f", GitTreeState:"dirty", BuildDate:"2023-07-26T14:16:25Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{GitVersion:"v0.37.0-rc.0.8522+e25174f3d21ad9", GitCommit:"e25174f3d21ad9202d534cba14bcd4eeb36ca9d6", GitTreeState:"clean", BuildDate:"2023-07-25T13:09:11Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes version (use kubectl version): Client Version: v1.27.3
  • OS (e.g. from /etc/os-release): CentOS Stream 9
  • Kernel (e.g. uname -a): Linux node01 5.14.0-331.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 22 15:26:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux