Implement TopologyInfo and cpu_ids in podresources interface #93243
// DevicesProvider knows how to provide the devices used by the given container
type CPUsProvider interface {
	GetCPUs(podUID, containerName string) []uint32
	UpdateAllocatedDevices()
We should name this more appropriately, maybe UpdateAllocatedCPUs(). Also, where is this method implemented? I assumed it would be in cpu_manager.go, but I couldn't find an implementation there.
Thanks, nice catch, I forgot to remove it from here. In DeviceManager this method is needed to bring the podDevices state up to date (GetDevice of DeviceManager returns devices from podDevices). For CPUManager, where we return the CPUs directly, UpdateAllocatedDevices is not needed, and neither is anything like the podDevices abstraction, since CPUManager uses CheckpointState for that purpose.
That sounds reasonable!
func (m *manager) GetCPUs(podUID, containerName string) []uint32 {
	cpus := m.state.GetCPUSetOrDefault(string(podUID), containerName)
	result := []uint32{}
	for cpu := range cpus.ToSliceNoSort() {
This line should be corrected to: for _, cpu := range cpus.ToSliceNoSort()
The current implementation stores the index values, not the CPU IDs, in the result slice (e.g. []uint32{0, 1, 2, 3, 4, ...}).
exactly
/release-note-none
/retest
/milestone v1.20
/retest
Possible GitHub issue; please link it if you run into this on other PRs: kubernetes/test-infra#19910
Yes, it's
This change is necessary for supporting Topology in the ContainerDevices. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
PodDevices will have its own guard Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
It covers deviceplugin & cpumanager. It has a drawback: cpuset and all other structs, including cadvisor's, keep CPU IDs as int, but for a protobuf-based interface a fixed-width integer is preferable. This patch also introduces an additional interface, CPUsProvider, although DeviceProvider could have been extended instead. The checkpoint is not covered by unit tests. Signed-off-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
/lgtm
Can you add a node_e2e test that invokes this in a follow-on?
/lgtm
/approve
@@ -393,11 +393,8 @@ func (m *ManagerImpl) Allocate(pod *v1.Pod, container *v1.Container) error {
func (m *ManagerImpl) UpdatePluginResources(node *schedulerframework.NodeInfo, attrs *lifecycle.PodAdmitAttributes) error {
	pod := attrs.Pod

	m.mutex.Lock()
just noting this appears resolved below
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: AlexeyPerevalov, derekwaynecarr
@derekwaynecarr I'm adding e2e tests for podresources in my PR which depends on this one (see 2bda2a7 and
/kind api-change
This PR adds two pieces of information to the podresources interface: numaid and a cpu_ids list (in cpuset list format).
It implements KEP 1884.
It is needed for topology-aware scheduling; this information will be used by the topology-exposing daemon.
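For readers unfamiliar with the cpuset list format mentioned above, the sketch below shows what such a string looks like: consecutive CPU IDs collapse into ranges, separated by commas. formatCPUList is a hypothetical, illustrative helper and is not taken from this PR or from the kubelet.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatCPUList (hypothetical) renders CPU IDs in cpuset list format,
// e.g. "0-3,8,10-11": sorted, de-duplicated, consecutive IDs as ranges.
func formatCPUList(ids []uint32) string {
	if len(ids) == 0 {
		return ""
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]uint32(nil), ids...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	var parts []string
	start, prev := sorted[0], sorted[0]
	// flush emits the run [start, prev] as "n" or "n-m".
	flush := func() {
		if start == prev {
			parts = append(parts, fmt.Sprintf("%d", start))
		} else {
			parts = append(parts, fmt.Sprintf("%d-%d", start, prev))
		}
	}
	for _, id := range sorted[1:] {
		switch {
		case id == prev: // duplicate, ignore
		case id == prev+1: // extends the current run
			prev = id
		default: // gap: close the run and start a new one
			flush()
			start, prev = id, id
		}
	}
	flush()
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(formatCPUList([]uint32{3, 0, 1, 2, 8, 10, 11})) // 0-3,8,10-11
}
```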
To test it, enable the KubeletPodResources feature gate, either via the command-line option or in the KubeletConfiguration (by default in the /var/lib/kubelet/config.yaml file). You can also use the following sample to track down pod resources.
Issue: kubernetes/enhancements#2043