
Conversation

@vladikr

@vladikr vladikr commented Jan 24, 2020

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind api-change
/kind bug
/kind cleanup
/kind deprecation
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

What this PR does / why we need it:
#83592 introduced the --reserved-cpus option to explicitly define the list of CPUs that the kubelet will reserve for system use. These reserved CPUs will not be used by the CPU manager for exclusive assignment to containers with guaranteed QoS.

However, when this option is used together with the isolcpus kernel option on a node, all pods with non-guaranteed QoS will only be able to execute on the CPUs defined by --reserved-cpus, since the CPUs listed in isolcpus are removed from kernel SMP balancing and scheduling.

If the number of reserved CPUs is low, few CPUs remain available for non-guaranteed pods to run on. On the other hand, enlarging reserved-cpus to compensate reduces the Capacity reported by this node and creates a scheduling problem.

This PR introduces a new kubelet argument, --static-cpus, and a StaticCPUs config option.
They restrict the list of CPUs that will be used for exclusive assignment to containers with guaranteed QoS to only the specified list.
With this parameter, the reserved-cpus set can stay small and still leave more CPUs for non-guaranteed pods, without reducing the node Capacity.
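For illustration, here is a rough sketch of the intended split (this is not the actual kubelet code; the parsing helper and set arithmetic below are simplified stand-ins for the kubelet's cpuset package, and the CPU lists are just example values):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList turns a list like "0-1,9-15" into a set of CPU IDs.
// It is a simplified stand-in for the kubelet's cpuset parser.
func parseCPUList(list string) map[int]bool {
	out := map[int]bool{}
	for _, part := range strings.Split(list, ",") {
		if part == "" {
			continue
		}
		if r := strings.SplitN(part, "-", 2); len(r) == 2 {
			lo, _ := strconv.Atoi(r[0])
			hi, _ := strconv.Atoi(r[1])
			for c := lo; c <= hi; c++ {
				out[c] = true
			}
		} else {
			c, _ := strconv.Atoi(part)
			out[c] = true
		}
	}
	return out
}

func main() {
	reserved := parseCPUList("0-1") // --reserved-cpus: kept for system/kubernetes use
	static := parseCPUList("9-15")  // --static-cpus (this PR): pool for exclusive assignment

	// Exclusive assignments for guaranteed-QoS containers would be drawn only
	// from static-cpus, never from the reserved set; everything else stays in
	// the shared pool and node Capacity is not reduced.
	exclusivePool := map[int]bool{}
	for c := range static {
		if !reserved[c] {
			exclusivePool[c] = true
		}
	}
	fmt.Println("CPUs eligible for exclusive assignment:", len(exclusivePool)) // 7
}
```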

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

A new kubelet argument `--static-cpus` and a `StaticCPUs` config option restrict the CPUs used for exclusive assignment to containers with guaranteed QoS to only the specified list.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jan 24, 2020
@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 24, 2020
@k8s-ci-robot
Contributor

Welcome @vladikr!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @vladikr. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 24, 2020
@vladikr vladikr changed the title staticCPUs kubelet option to specify a restricted list of CPUs for exclusive assignment [WIP] staticCPUs kubelet option to specify a restricted list of CPUs for exclusive assignment Jan 24, 2020
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/kubelet kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 24, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vladikr
To complete the pull request process, please assign dchen1107, erictune
You can assign the PR to them by writing /assign @dchen1107 @erictune in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jan 25, 2020

@davidvossel davidvossel left a comment


The approach seems reasonable to me.

From my perspective, this is the least invasive way for us to express the ability to assign guaranteed QoS containers to a CPU range that makes use of the isolcpus kernel setting. This covers the situation where we need to guarantee that certain CPUs are ineligible for general system tasks, and we want those CPUs to run high-performance, low-latency workloads.


It's probably worth adding some error checking here to validate that the CPU range falls within the range actually available on the host, and that it doesn't overlap with something like reserved-cpus or any other conflicting options.
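Something roughly like this, just to sketch the checks I mean (the helper name is hypothetical, and the real code would use the kubelet's cpuset type rather than map[int]bool):

```go
package main

import "fmt"

// validateStaticCPUs is a hypothetical helper illustrating the checks
// suggested above: every requested CPU must exist on the host and must
// not be part of the reserved set.
func validateStaticCPUs(static, online, reserved map[int]bool) error {
	for c := range static {
		if !online[c] {
			return fmt.Errorf("static cpu %d is not available on this host", c)
		}
		if reserved[c] {
			return fmt.Errorf("static cpu %d overlaps with --reserved-cpus", c)
		}
	}
	return nil
}

func main() {
	online := map[int]bool{0: true, 1: true, 2: true, 3: true}
	reserved := map[int]bool{0: true, 1: true}
	static := map[int]bool{1: true, 2: true, 3: true}
	fmt.Println(validateStaticCPUs(static, online, reserved)) // cpu 1 overlaps with --reserved-cpus
}
```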

…ve assignment

Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
This option restricts the list of CPUs that will be used for exclusive assignment

Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
… cm static policy

Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
…icCPUs

Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 28, 2020
@vladikr
Author

vladikr commented Jan 28, 2020

/cc @derekwaynecarr

@rphillips
Member

cc @kubernetes/sig-node-pr-reviews

@rphillips
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 28, 2020
@vpickard
Contributor

/cc @ConnorDoyle

@fejta-bot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

Member

@derekwaynecarr derekwaynecarr left a comment


I am finding this slightly confusing, but it's a complicated domain.

Can you help me understand how a typical node administrator would configure all the specified values together to achieve the desired effect? It seems like this would ultimately restrict the number of non-integral-core pods that could be placed on the node, since the scheduler would not know about the capacity that appears to be implicitly reserved for integral requests.

fs.Int32Var(&c.PodsPerCore, "pods-per-core", c.PodsPerCore, "Number of Pods per core that can run on this Kubelet. The total number of Pods on this Kubelet cannot exceed max-pods, so max-pods will be used if this calculation results in a larger number of Pods allowed on the Kubelet. A value of 0 disables this limit.")
fs.BoolVar(&c.ProtectKernelDefaults, "protect-kernel-defaults", c.ProtectKernelDefaults, "Default kubelet behaviour for kernel tuning. If set, kubelet errors if any of kernel tunables is different than kubelet defaults.")
fs.StringVar(&c.ReservedSystemCPUs, "reserved-cpus", c.ReservedSystemCPUs, "A comma-separated list of CPUs or CPU ranges that are reserved for system and kubernetes usage. This specific list will supersede cpu counts in --system-reserved and --kube-reserved.")
fs.StringVar(&c.StaticCPUs, "static-cpus", c.StaticCPUs, "A comma-separated list of CPUs or CPU ranges that will be used for exclusive assignment. This list restricts the CPUs available for exclusive assignment. Containers with guaranteed QoS will be assigned CPUs only from this list.")
Member


I am trying to understand this proposal better.

Today, if I use either --system-reserved/kube-reserved OR reserved-cpus, I am reducing my node allocatable capacity. If used in concert with the static CPU manager policy, the specified or low-numbered cores are excluded from exclusive scheduling but remain available for non-G QoS pods.

If I use static-cpus, I am not reducing allocatable, but I am implicitly restricting the number of G versus non-G QoS pods I can run on a node without any corresponding signal back to the scheduler. Is this an accurate understanding of the proposal? The net is that a kubelet could report allocatable cpu=6, but if static-cpus=1-4, it's really allocatable=6 where 4 can be used by G pods and 2 are left for non-G pods. This seems confusing, but it's possible I am reading this wrong.

If you could show a concrete configuration of isolcpu, reserved-cpus, and static-cpus used in concert it would help.

Member


Another way to phrase this: prior to this static-cpus proposal, the list of CPU ranges that could be handed out for exclusive assignment adjusted dynamically with the pods that were scheduled to the node. This makes that dynamism go away, and restricts where non-G pods are able to execute and fit.

Author


Thanks for the review @derekwaynecarr

I am trying to understand this proposal better.

Today, if I use either --system-reserved/kube-reserved OR reserved-cpus, I am reducing my node allocatable capacity. If used in concert with the static CPU manager policy, the specified or low-numbered cores are excluded from exclusive scheduling but remain available for non-G QoS pods.

This is true from the k8s perspective; however, if the isolcpus kernel option is configured, it will prevent processes running in these non-G QoS pods from executing on the isolated CPUs.
If there are 6 CPUs in total and we configure --reserved-cpus=0-1 and isolcpus=2-5, non-G pods will only be able to execute on CPUs 0-1.

If I use static-cpus, I am not reducing allocatable, but I am implicitly restricting the number of G versus non-G QoS pods I can run on a node without any corresponding signal back to the scheduler. Is this an accurate understanding of the proposal? The net is that a kubelet could report allocatable cpu=6, but if static-cpus=1-4, it's really allocatable=6 where 4 can be used by G pods and 2 are left for non-G pods. This seems confusing, but it's possible I am reading this wrong.
Yes, that's correct. However, static-cpus only applies to G QoS pods, in the same way that reserved-cpus only applies to G QoS pods.

If you could show a concrete configuration of isolcpu, reserved-cpus, and static-cpus used in concert it would help.

I think there will be no need to configure reserved-cpus together with the new static-cpus option.
For example: 6 CPUs in total, kernel isolcpus=2-5, and static-cpus=2-5.

You are right about the scheduler. I don't know how to solve this; I didn't find a way to account only for G CPUs with the static-cpus option. However, I thought this would mimic the behavior of the Affinity Manager, which will error if it cannot align pod CPUs to the correct NUMA node.
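To spell that example out (my reading of the intent, simplified and not taken from the PR's code):

```go
package main

import "fmt"

func main() {
	// Illustrative only: the split implied by the example above, with 6 CPUs,
	// kernel isolcpus=2-5, --static-cpus=2-5 and no --reserved-cpus.
	exclusivePool := []int{2, 3, 4, 5} // only these CPUs are handed out exclusively to G pods
	sharedPool := []int{0, 1}          // non-G pods and system tasks keep load-balancing here
	allocatable := 6                   // node Capacity/Allocatable is not reduced by --static-cpus

	fmt.Println("exclusive pool:", exclusivePool)
	fmt.Println("shared pool:   ", sharedPool)
	fmt.Println("allocatable:   ", allocatable)
}
```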

Author


Another way to phrase this: prior to this static-cpus proposal, the list of CPU ranges that could be handed out for exclusive assignment adjusted dynamically with the pods that were scheduled to the node. This makes that dynamism go away, and restricts where non-G pods are able to execute and fit.

I think the non-G pods are not affected by this change, from the k8s perspective. At least, I didn't intend to change this behavior.

As for the dynamism, you're right, but it is also gone when the kernel isolcpus option is used in combination with reserved-cpus.

My take on this, and it may be wrong, is that those who configure isolcpus on the node know that they are effectively limiting their G pods to only this list of CPUs. However, without the static-cpus option they didn't have a way to effectively restrict the pool of G CPUs to only the isolcpus set without reducing the node's allocatable capacity.

Contributor

@jianzzha jianzzha Feb 4, 2020


Q: can the static CPU set ever be used by non-G pods?

Contributor


Another possible issue: say you have 6 allocatable CPUs on the system but static-cpus=4-5; a pod that requests 4 exclusive CPUs can be scheduled to this node, but 4 > 2 and the kubelet will fail to accommodate this pod. Right?

@derekwaynecarr
Member

I would like to also get @klueska feedback on this.

/assign @klueska

@atheurer

atheurer commented Jan 28, 2020

The need for this feature is mostly driven by the effects of the isolcpus kernel tuning parameter. isolcpus, for now, is the only real way to get very, very low interruption on a CPU, but it also comes at the cost of disabling load balancing. So, if one were to use isolcpus today, the outcome could be very undesirable if, for example, --reserved-cpus and the kernel's isolcpus do not consume all of the CPUs on a host. Let's start with some examples:

Scenario-1:
A host has CPUs 0-15, and there is a desire to run both guaranteed and non-guaranteed pods. We assume (and for some this may not be true) that the guaranteed pods will self-assign their threads to CPUs they are allowed to use. On this system, --reserved-cpus is set to 0-1 and isolcpus is set to 9-15, leaving CPUs 2-8 in neither the reserved pool nor isolcpus.

A pod is deployed with guaranteed policy (it asks for 7 cpus and it wants very low latency/interruption), it happens to get placed on cpus 2-8. Another pod is deployed, but is not guaranteed (it wants 7 cpus, it is fine with some interruption, but expects load-balancing), and it can run on CPUs 0-1,9-15 (anywhere a guaranteed pod is not running). The outcome here is that the guaranteed pod did not get the very-low latency/interruption CPUs, and the non-guaranteed pod, while seemingly getting lots of CPUs, can really only run on CPUs 0-1 because load-balancing is disabled on CPUs 9-15. So in this case everyone is disappointed :)

Scenario-2:
Same type of host, but to avoid the "hole" in CPU tuning and kubelet CPU management, we set --reserved-cpus to 0-8 and isolcpus to 9-15. Deploying the guaranteed pod works fine: we get the isolated CPUs because the large --reserved-cpus prevents it from using any others. However, when attempting to deploy the non-guaranteed pod it fails to be scheduled on this node because the CPU capacity of this node is now much lower (due to large --reserved-cpus).

Scenario-3:
Same type of host, but this time we keep --reserved-cpus small (0-1), so the CPU capacity of this node is not so low, and we tune isolcpus to be 2-15. We deploy the guaranteed pod, and it runs on CPUs 2-8. We then deploy the non-guaranteed pod, which has access to CPUs 0-1,9-15, but it can really only run on CPUs 0-1 because load balancing is disabled on all CPUs but 0-1.

All of these scenarios have issues. With the new --static-cpus feature, we can do the following:

Scenario-4:
Same type of host, and this time --reserved-cpus is 0-1 and isolcpus is 9-15, and we have that "hole" in CPUs 2-8, but we add --static-cpus=9-15 to kubelet. When the guaranteed pod is deployed, it uses CPUs 9-15. When the non-guaranteed pod is deployed, it uses CPUs 0-8 and it can actually use all of those because they all load-balance. Everyone is happy.

I should note that eventually, once we can get the kernel to make all CPUs except the ones used for --reserved-cpus very low latency/interruption (no need for the isolcpus kernel parameter), it would not be necessary to use --static-cpus, unless you had a very specific use case, like specifying only the CPUs on one NUMA node so you could be confident the guaranteed pod got exclusive use of an L3 cache.
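For concreteness, here is the Scenario-4 arithmetic as a rough sketch, using simplified stand-ins for the kubelet's cpuset math (this is not kubelet code):

```go
package main

import "fmt"

// rng builds a set of CPU IDs in [lo, hi]; diff returns a minus b.
// Both are simplified stand-ins for the kubelet's cpuset arithmetic.
func rng(lo, hi int) map[int]bool {
	out := map[int]bool{}
	for c := lo; c <= hi; c++ {
		out[c] = true
	}
	return out
}

func diff(a, b map[int]bool) map[int]bool {
	out := map[int]bool{}
	for c := range a {
		if !b[c] {
			out[c] = true
		}
	}
	return out
}

func main() {
	all := rng(0, 15)
	isol := rng(9, 15)      // kernel isolcpus: no load balancing on these CPUs
	exclusive := rng(9, 15) // Scenario-4: --static-cpus=9-15, so the 7-CPU guaranteed pod lands here

	// Non-guaranteed pods share everything that is not exclusively assigned ...
	shared := diff(all, exclusive) // CPUs 0-8
	// ... and the load-balanced part of that excludes isolcpus.
	usable := diff(shared, isol) // still CPUs 0-8, since static-cpus and isolcpus coincide

	fmt.Println("shared CPUs:", len(shared), "load-balanced CPUs:", len(usable)) // 9 and 9
}
```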

@k8s-ci-robot
Contributor

@vladikr: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-verify 6cd4aa2 link /test pull-kubernetes-verify
pull-kubernetes-e2e-gce 6cd4aa2 link /test pull-kubernetes-e2e-gce
pull-kubernetes-kubemark-e2e-gce-big 6cd4aa2 link /test pull-kubernetes-kubemark-e2e-gce-big

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

@ipuustin ipuustin left a comment


I think this is another approach to the CPU pool problem, which is about dividing the available CPUs in the system into various groups with some semantics. The real issue is workload placement: how do we get the right pods running on the right CPU pools?

I submitted a proposal about a pretty similar problem some time ago: kubernetes/community#2739. There the idea was to split the CPUs using QoS classes, so that we could have the right allocatable number of CPUs for the node and get "best effort" workloads running on separate cores, and I seem to remember that isolcpus was also part of the solution space. Also @Levovar has a KEP proposal about making CPU Manager respect the isolcpus setting: kubernetes/community#2435.

var staticCPUs cpuset.CPUSet
var errParse error

if s.StaticCPUs != "" {
Contributor


I think the idea is that --static-cpus is consumed by the static CPU Manager policy, right? Should there rather be a generic way of configuring the CPU Manager policies? For example, being able to assign a map of configuration parameters to them, or adding a configuration file next to the CPU Manager state file?
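One possible shape for such a generic mechanism, sketched purely as an illustration (the struct and field names are hypothetical and do not exist in the kubelet here):

```go
package main

import "fmt"

// CPUManagerConfig is a hypothetical shape for a generic per-policy options
// map, as suggested above; none of these names are existing kubelet API.
type CPUManagerConfig struct {
	Policy        string            // e.g. "static"
	PolicyOptions map[string]string // free-form options consumed by the chosen policy
}

func main() {
	cfg := CPUManagerConfig{
		Policy: "static",
		PolicyOptions: map[string]string{
			"static-cpus": "2-5", // the static policy would parse this with its cpuset parser
		},
	}
	// Each policy would read only the keys it understands and ignore the rest.
	fmt.Println(cfg.Policy, cfg.PolicyOptions["static-cpus"])
}
```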

reservedCPUsFloat := float64(reservedCPUs.MilliValue()) / 1000
numReservedCPUs := int(math.Ceil(reservedCPUsFloat))
policy, err = NewStaticPolicy(topo, numReservedCPUs, specificCPUs, affinity)
policy, err = NewStaticPolicy(topo, numReservedCPUs, specificCPUs, staticCPUs, affinity)
Contributor


I have a pending PR (#87522) which also adds a new parameter to static policy creation indicating the available CPUs (using a cpuset cgroup file). These seem to be related approaches.
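For readers skimming the diff, a sketch of my mental model of what the extra staticCPUs argument would do inside the static policy (an assumption on my part, not the PR's actual implementation):

```go
package cpumanager

import "k8s.io/kubernetes/pkg/kubelet/cm/cpuset"

// assignableCPUs sketches how staticCPUs could narrow the static policy's
// exclusive pool: exclusive assignments come only from staticCPUs minus the
// reserved set, while all other CPUs stay in the shared pool and keep
// counting toward node allocatable.
func assignableCPUs(allCPUs, reserved, staticCPUs cpuset.CPUSet) cpuset.CPUSet {
	pool := allCPUs.Difference(reserved)
	if staticCPUs.Size() > 0 {
		pool = pool.Intersection(staticCPUs)
	}
	return pool
}
```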

@klueska
Contributor

klueska commented Feb 2, 2020

It's on my TODO list to review this week.

I would like to also get @klueska feedback on this.

/assign @klueska

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 26, 2020
@k8s-ci-robot
Contributor

@vladikr: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 26, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 25, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@shiya0705
Contributor

shiya0705 commented Jul 23, 2025

(Quoting @atheurer's full scenario walkthrough above.)

@vladikr @davidvossel @derekwaynecarr @ipuustin @atheurer

This is a PR from a long time ago, but I am very interested in this topic; the problem described in Scenarios 1 to 3 really happened for us, yet the PR was ultimately not merged. After reviewing the PR discussion and comments, I cannot tell why it stalled.
Could you kindly share any insights into why the static CPU list solution wasn't pursued further? I'd appreciate understanding the blockers, such as technical concerns, design trade-offs, or alignment with Kubernetes priorities.
Is there another solution in current Kubernetes that addresses this issue (Scenarios 1-3)?
