Evict pod if the device is removed #61

Open
tnyeanderson opened this issue Dec 21, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@tnyeanderson

I hope that I'm just missing something simple here. I have configured a USB device in the generic-device-plugin, and I'm able to ensure that a certain pod will only be scheduled on nodeA which has that USB device plugged in by setting resource limits. So far: AWESOME!

I can unplug the USB device from nodeA and plug it into nodeB, and each node's .status.capacity and .status.allocatable are updated to reflect which node has the device. PERFECT!

The problem that I have is that if the pod has already been scheduled and is running before I move the USB to nodeB, the pod will remain on the node which no longer has the device available. I was hoping that the scheduler would recognize that the node no longer has the resources to support the pod, evict it, and eventually reschedule it on nodeB once it's available. But this doesn't happen according to my testing.
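To make the condition above concrete, here is a minimal sketch in Go of the check I was hoping something would perform: compare a node's allocatable count for the device resource against what the pods already running there have requested. The `PodRequest` type and `overcommitted` function are hypothetical names used only for illustration; they are not part of Kubernetes or of the generic-device-plugin.

```go
package main

import "fmt"

// PodRequest is a simplified, hypothetical stand-in for a running pod's
// request for a single extended resource (for example, one USB device).
type PodRequest struct {
	Name  string
	Count int64
}

// overcommitted returns how many units of the resource are requested by
// running pods beyond what the node currently reports as allocatable.
// A result of 0 means the node can still back every running pod.
func overcommitted(allocatable int64, pods []PodRequest) int64 {
	var requested int64
	for _, p := range pods {
		requested += p.Count
	}
	if requested > allocatable {
		return requested - allocatable
	}
	return 0
}

func main() {
	// After the USB device is unplugged, nodeA reports 0 allocatable,
	// but the pod that requested one device is still running there.
	running := []PodRequest{{Name: "usb-consumer", Count: 1}}
	fmt.Println("units over allocatable:", overcommitted(0, running))
}
```

A count like this only shows that a node is over-committed. It does not say which pod was holding the device that went away.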

I've thought of a few possible workarounds (involving labels and affinity rules), but I wanted to see if there are any existing ideas or solutions.

@squat
Owner

squat commented Dec 26, 2023

Hi @tnyeanderson, eviction functionality is not part of the device plugin today! That would be a very cool feature to consider adding.

If you're interested in contributing such functionality, I would happily review and merge. This is probably the best place to look for inspiration: https://github.com/kubernetes-sigs/descheduler.

Designing this component could be somewhat tricky. The component needs to keep track of the identity of every device on every node and which pod it has been allocated to. If the pod tracking this information dies, then it would lose track of which pods have been allocated which devices. We'd need to make this information persistent. Thinking quickly, one way to accomplish this would be for the plugin to annotate all pods that receive a device with the device's ID and then to evict pods matching an annotation for a device that's disappeared from a node.
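As a rough sketch of the annotation idea, the selection step could look something like the Go below. Everything here is an assumption made for illustration: the annotation key `example.com/allocated-device-id`, the simplified `Pod` type, and the `podsToEvict` function do not exist in the plugin today, and a real implementation would read pods from the Kubernetes API and evict them through it rather than work on in-memory values.

```go
package main

import (
	"fmt"
	"sort"
)

// deviceAnnotation is a hypothetical annotation key under which the plugin
// would record the ID of the device it allocated to a pod.
const deviceAnnotation = "example.com/allocated-device-id"

// Pod is a simplified stand-in for a Kubernetes pod, holding only the
// fields this sketch needs.
type Pod struct {
	Name        string
	Node        string
	Annotations map[string]string
}

// podsToEvict returns the sorted names of pods scheduled on node whose
// annotated device ID is not in the set of devices currently present there.
// Pods without the annotation never received a device and are left alone.
func podsToEvict(node string, present map[string]bool, pods []Pod) []string {
	var evict []string
	for _, p := range pods {
		if p.Node != node {
			continue
		}
		id, ok := p.Annotations[deviceAnnotation]
		if !ok {
			continue
		}
		if !present[id] {
			evict = append(evict, p.Name)
		}
	}
	sort.Strings(evict)
	return evict
}

func main() {
	pods := []Pod{
		{Name: "usb-consumer", Node: "nodeA", Annotations: map[string]string{deviceAnnotation: "usb-1"}},
		{Name: "unrelated", Node: "nodeA"},
		{Name: "other-node", Node: "nodeB", Annotations: map[string]string{deviceAnnotation: "usb-1"}},
	}
	// usb-1 has been unplugged from nodeA, so nodeA's present set is empty.
	fmt.Println(podsToEvict("nodeA", map[string]bool{}, pods))
}
```

Because the device ID lives on the pod itself, the mapping survives a restart of the plugin: it can be rebuilt by listing pods instead of being held in the plugin's memory.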

@squat squat added the enhancement New feature or request label Dec 26, 2023
@tnyeanderson
Author

> Thinking quickly, one way to accomplish this would be for the plugin to annotate all pods that receive a device with the device's ID and then to evict pods matching an annotation for a device that's disappeared from a node.

Sounds brilliant to me! I'll be away for the holidays, and the new year at work (and at home) tends to be a little busy, but I'll try to hit this sometime in January and get it to you for review.

Thanks!
