-
Notifications
You must be signed in to change notification settings - Fork 38.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Topology Manager Scope (container | pod) Feature #92967
Conversation
Welcome @cezaryzukowski! |
Hi @cezaryzukowski. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
/sig node |
97d9b81
to
c237f8f
Compare
/cc @klueska |
cmd/kubelet/app/options/options.go
Outdated
@@ -530,6 +530,8 @@ func AddKubeletConfigFlags(mainfs *pflag.FlagSet, c *kubeletconfig.KubeletConfig | |||
fs.Int32Var(&c.PodsPerCore, "pods-per-core", c.PodsPerCore, "Number of Pods per core that can run on this Kubelet. The total number of Pods on this Kubelet cannot exceed max-pods, so max-pods will be used if this calculation results in a larger number of Pods allowed on the Kubelet. A value of 0 disables this limit.") | |||
fs.BoolVar(&c.ProtectKernelDefaults, "protect-kernel-defaults", c.ProtectKernelDefaults, "Default kubelet behaviour for kernel tuning. If set, kubelet errors if any of kernel tunables is different than kubelet defaults.") | |||
fs.StringVar(&c.ReservedSystemCPUs, "reserved-cpus", c.ReservedSystemCPUs, "A comma-separated list of CPUs or CPU ranges that are reserved for system and kubernetes usage. This specific list will supersede cpu counts in --system-reserved and --kube-reserved.") | |||
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Topology Manager Scope represents the scope of topology hint generation that topology manager requests and hint providers generates. Possible values: 'container', 'pod'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mention default value:
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Topology Manager Scope represents the scope of topology hint generation that topology manager requests and hint providers generates. Possible values: 'container', 'pod'.") | |
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Topology Manager Scope represents the scope of topology hint generation that topology manager requests and hint providers generates. Possible values: 'container' (default), 'pod'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cmd/kubelet/app/options/options.go
Outdated
@@ -530,6 +530,8 @@ func AddKubeletConfigFlags(mainfs *pflag.FlagSet, c *kubeletconfig.KubeletConfig | |||
fs.Int32Var(&c.PodsPerCore, "pods-per-core", c.PodsPerCore, "Number of Pods per core that can run on this Kubelet. The total number of Pods on this Kubelet cannot exceed max-pods, so max-pods will be used if this calculation results in a larger number of Pods allowed on the Kubelet. A value of 0 disables this limit.") | |||
fs.BoolVar(&c.ProtectKernelDefaults, "protect-kernel-defaults", c.ProtectKernelDefaults, "Default kubelet behaviour for kernel tuning. If set, kubelet errors if any of kernel tunables is different than kubelet defaults.") | |||
fs.StringVar(&c.ReservedSystemCPUs, "reserved-cpus", c.ReservedSystemCPUs, "A comma-separated list of CPUs or CPU ranges that are reserved for system and kubernetes usage. This specific list will supersede cpu counts in --system-reserved and --kube-reserved.") | |||
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Topology Manager Scope represents the scope of topology hint generation that topology manager requests and hint providers generates. Possible values: 'container', 'pod'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this sentence is really hard to read and parse. Try to rephrase. Maybe like this:
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Topology Manager Scope represents the scope of topology hint generation that topology manager requests and hint providers generates. Possible values: 'container', 'pod'.") | |
fs.StringVar(&c.TopologyManagerScope, "topology-manager-scope", c.TopologyManagerScope, "Scope to which topology hints applied. Topology Manager collects hints from Hint Providers and applies them to defined scope to ensure the pod admission. Possible values: 'container' (default), 'pod'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/kubelet/apis/config/types.go
Outdated
@@ -67,6 +67,12 @@ const ( | |||
// SingleNumaNodeTopologyManager Policy iis a mode in which kubelet only allows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated to this PR, but mentioning anyway:
// SingleNumaNodeTopologyManager Policy iis a mode in which kubelet only allows | |
// SingleNumaNodeTopologyManager Policy is a mode in which kubelet only allows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -117,6 +117,9 @@ func ValidateKubeletConfiguration(kc *kubeletconfig.KubeletConfiguration) error | |||
if kc.TopologyManagerPolicy != kubeletconfig.NoneTopologyManagerPolicy && !localFeatureGate.Enabled(features.TopologyManager) { | |||
allErrors = append(allErrors, fmt.Errorf("invalid configuration: TopologyManager %v requires feature gate TopologyManager", kc.TopologyManagerPolicy)) | |||
} | |||
if kc.TopologyManagerScope != kubeletconfig.ContainerScopeTopology && !localFeatureGate.Enabled(features.TopologyManager) { | |||
allErrors = append(allErrors, fmt.Errorf("invalid configuration: TopologyScope %v requires feature gate TopologyManager", kc.TopologyManagerScope)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allErrors = append(allErrors, fmt.Errorf("invalid configuration: TopologyScope %v requires feature gate TopologyManager", kc.TopologyManagerScope)) | |
allErrors = append(allErrors, fmt.Errorf("invalid configuration: TopologyManagerScope %v requires feature gate TopologyManager", kc.TopologyManagerScope)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/kubelet/apis/config/types.go
Outdated
@@ -221,6 +227,12 @@ type KubeletConfiguration struct { | |||
// TopologyManagerPolicy is the name of the policy to use. | |||
// Policies other than "none" require the TopologyManager feature gate to be enabled. | |||
TopologyManagerPolicy string | |||
// Topology Manager Scope represents the scope of topology hint generation | |||
// that topology manager requests and hint providers generates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// that topology manager requests and hint providers generates. | |
// that topology manager requests and hint providers generate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -77,6 +77,11 @@ type Manager interface { | |||
// and is consulted to achieve NUMA aware resource alignment among this | |||
// and other resource controllers. | |||
GetTopologyHints(*v1.Pod, *v1.Container) map[string][]topologymanager.TopologyHint | |||
|
|||
// GetTopologyHints implements the topologymanager. HintProvider Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name in comment should match the name of the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// GetTopologyHints implements the topologymanager. HintProvider Interface | |
// GetPodLevelTopologyHints implements the topologymanager.HintProvider Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed for GetPodTopologyHints
pkg/kubelet/cm/cpumanager/policy.go
Outdated
@@ -34,4 +34,8 @@ type Policy interface { | |||
// and is consulted to achieve NUMA aware resource alignment among this | |||
// and other resource controllers. | |||
GetTopologyHints(s state.State, pod *v1.Pod, container *v1.Container) map[string][]topologymanager.TopologyHint | |||
// GetTopologyHints implements the topologymanager. HintProvider Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name should match
// GetTopologyHints implements the topologymanager. HintProvider Interface | |
// GetPodLevelTopologyHints implements the topologymanager.HintProvider Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed for GetPodTopologyHints
@@ -78,6 +78,12 @@ func (m *ManagerImpl) GetTopologyHints(pod *v1.Pod, container *v1.Container) map | |||
return deviceHints | |||
} | |||
|
|||
// GetPodLevelTopologyHints implements the TopologyManager HintProvider Interface which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// GetPodLevelTopologyHints implements the TopologyManager HintProvider Interface which | |
// GetPodLevelTopologyHints implements the topologymanager.HintProvider Interface which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -37,6 +37,17 @@ const ( | |||
// present on a machine and the TopologyManager is enabled, an error will | |||
// be returned and the TopologyManager will not be loaded. | |||
maxAllowableNUMANodes = 8 | |||
// containerScopeTopology specifies the TopologyManagerScope per container. | |||
containerScopeTopology = "container" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it scopetopology, not topologyscope?
containerScopeTopology = "container" | |
containerTopologyScope = "container" |
I'd also question dropping the Manager
from the name. Why change terminology? Is the shorter name better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
* Add podDevices() func * Add getPodDeviceRequest() func Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
* Extract common tests cases that will be used for both GetTopologyHints() and GetPodTopologyHints() * Extract machineInfo as it will be used for both functions as well Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
* Add tests for getPodRequestedCPU() Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
Pod object is more flexible to use and construct * Update TestGetTopologyHints() to work according to new test cases * Update topologyHintTestCase{} to include proper field Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
* Add additional test cases returned by getPodScopeTestCases() Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
A suite of e2e tests was created for Topology Manager so as to test pod scope alignment feature. Co-authored-by: Pawel Rapacz <p.rapacz@partner.samsung.com> Co-authored-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com> Signed-off-by: Cezary Zukowski <c.zukowski@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cezaryzukowski, derekwaynecarr, klueska, smarterclayton The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This comment has been minimized.
This comment has been minimized.
/test pull-kubernetes-e2e-kind-ipv6 |
/lgtm |
@cezaryzukowski: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/test pull-kubernetes-e2e-gce-ubuntu-containerd |
What type of PR is this?
/kind feature
What this PR does / why we need it:
To enable pod-level resource affinity (details in the update to the KEP).
Special notes for your reviewer:
Add support for pod-level affinity next to the existing container-level affinity. Add the "--topology-manager-scope" flag to the kubelet binary and the "topologyManagerScope" field in the kubelet v1beta1 configuration. The value for both can be either "pod" or "container" (defaults to "container").
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
[KEP]: kubernetes/enhancements#1752