feat: add KubevirtMigrationAware evictor plugin #591
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: tiraboschi
b76f84a to b97edf7
Adds a new EvictorPlugin that makes the descheduler aware of KubeVirt
live-migration state when deciding whether to evict virt-launcher pods.
Filter (hard block): prevents eviction of pods whose VMI has a migration
in progress (startTimestamp set, endTimestamp absent in migrationState).
KubeVirt's own admission webhook provides a complementary safety net at
the API layer; this plugin acts upstream of it to avoid the round-trip.
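
The in-progress check described above can be sketched as follows. This is a minimal illustration, not the plugin's code: it works on the decoded map that an unstructured object carries, and it assumes `migrationState` sits under `status`. The function name `migrationInProgress` is hypothetical.

```go
package main

// migrationInProgress reports whether a VMI has a live migration that has
// started but not yet ended. The VMI is passed as the plain decoded map that
// an unstructured object wraps.
// Assumption: migrationState lives under "status" and both timestamps are
// serialized as strings.
func migrationInProgress(vmi map[string]interface{}) bool {
	status, ok := vmi["status"].(map[string]interface{})
	if !ok {
		return false // no status: nothing to block on
	}
	state, ok := status["migrationState"].(map[string]interface{})
	if !ok {
		return false // the VMI has never been migrated
	}
	start, _ := state["startTimestamp"].(string)
	end, _ := state["endTimestamp"].(string)
	return start != "" && end == ""
}
```

A Filter built on this would return "do not evict" only when the function returns true; a missing or malformed status leaves the pod evictable.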
PreEvictionFilter (soft block): defers eviction of pods whose VMI
completed a migration recently, using a three-layer adaptive cooldown:
1. Base: max(migrationCooldown, migrationDuration) — heavier VMs
(longer migrations) automatically receive longer protection.
2. Backoff: base × 2^(count−1) where count is the number of migration
completions recorded in a configurable sliding history window
(default 24h). Each successive migration within the window doubles
the cooldown, making repeated churn progressively harder.
3. Cap: the result is bounded by maxMigrationCooldown (default 6h) to
prevent pathological cases from locking a VM indefinitely.
Defaults: migrationCooldown=15m, maxMigrationCooldown=6h,
migrationHistoryWindow=24h. All three are operator-configurable.
Both extension points read from a dedicated dynamic VMI informer cache
(kubevirt.io/v1 VirtualMachineInstances), avoiding API-server calls in
the hot eviction path. An UpdateFunc event handler on the same informer
records migration completions by VMI UID to drive the backoff history.
The cache warms up at startup with a 30s timeout; failure to sync is a
hard error so the descheduler does not start with stale or empty state.
Two Prometheus metrics are registered on first use:
- descheduler_kubevirt_eviction_blocks_total{reason,node,namespace}
counter — tracks eviction blocks for alerting and per-node diagnosis.
- descheduler_kubevirt_effective_cooldown_seconds histogram — shows the
distribution of applied cooldown durations across backoff buckets
(15m, 30m, 1h, 2h, 4h, 6h) so operators can tell whether the backoff
is engaging or VMs are piling up at the cap.
All code paths that cannot retrieve or parse VMI state fail open
(allow eviction) so the plugin never blocks unrelated workloads.
Unit tests cover Filter, PreEvictionFilter (base cooldown, adaptive
duration, exponential backoff, maxMigrationCooldown cap), migration
history recording and pruning, informer event handler, defaults, and
validation. No kubevirt imports are required: VMI state is expressed as
plain *unstructured.Unstructured objects, exactly as the dynamic informer
delivers them at runtime.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
b97edf7 to a07081e
@tiraboschi: A test failed on this PR; see the full PR test history.
Enable the KubevirtMigrationAware plugin as filter and pre-eviction filter in the KubeVirtRelieveAndMigrate profile. The filter hard-blocks eviction of VMs with an in-progress migration; the pre-eviction filter applies an adaptive cooldown after migration completes to prevent cascading re-evictions.
Introduce three new ProfileCustomizations fields to tune the plugin:
- devMigrationCooldown: base cooldown after a migration completes
- devMaxMigrationCooldown: upper bound on the exponential backoff
- devMigrationHistoryWindow: sliding window for counting past migrations
Requires: openshift/descheduler#591
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
Description
Adds a new EvictorPlugin that makes the descheduler aware of KubeVirt live-migration state when deciding whether to evict virt-launcher pods.
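Filter (hard block): prevents eviction of pods whose VMI has a migration in progress (`startTimestamp` set, `endTimestamp` absent in migrationState). KubeVirt's own admission webhook provides a complementary safety net at the API layer; this plugin acts upstream of it to avoid the round-trip.
PreEvictionFilter (soft block): defers eviction of pods whose VMI completed a migration recently, using a three-layer adaptive cooldown:
1. Base: max(migrationCooldown, migrationDuration), so heavier VMs (longer migrations) automatically receive longer protection.
2. Backoff: base × 2^(count−1), where count is the number of migration completions recorded in a configurable sliding history window (default 24h). Each successive migration within the window doubles the cooldown, making repeated churn progressively harder.
3. Cap: the result is bounded by maxMigrationCooldown (default 6h) to prevent pathological cases from locking a VM indefinitely.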
Defaults:
- `migrationCooldown=15m`
- `maxMigrationCooldown=6h`
- `migrationHistoryWindow=24h`
All three are operator-configurable.
Both extension points read from a dedicated dynamic VMI informer cache (kubevirt.io/v1 VirtualMachineInstances), avoiding API-server calls in the hot eviction path. An UpdateFunc event handler on the same informer records migration completions by VMI UID to drive the backoff history.
The cache warms up at startup with a 30s timeout; failure to sync is a hard error so the descheduler does not start with stale or empty state.
Two Prometheus metrics are registered on first use:
- `descheduler_kubevirt_eviction_blocks_total{reason,node,namespace}` counter: tracks eviction blocks for alerting and per-node diagnosis.
- `descheduler_kubevirt_effective_cooldown_seconds` histogram: shows the distribution of applied cooldown durations across backoff buckets (15m, 30m, 1h, 2h, 4h, 6h) so operators can tell whether the backoff is engaging or VMs are piling up at the cap.
All code paths that cannot retrieve or parse VMI state fail open (allow eviction) so the plugin never blocks unrelated workloads.
Unit tests cover Filter, PreEvictionFilter (base cooldown, adaptive duration, exponential backoff, maxMigrationCooldown cap), migration history recording and pruning, informer event handler, defaults, and validation. No kubevirt imports are required: VMI state is expressed as plain *unstructured.Unstructured objects, exactly as the dynamic informer delivers them at runtime.
Checklist
Please ensure your pull request meets the following criteria before submitting
for review. Reviewers will use these items to assess the quality and
completeness of your changes: