DRS plan generation fails for entire cluster when a VM uses host passthrough #13098

@raniers1

Description

problem

When automated DRS is enabled and a cluster contains a VM with a vGPU
passthrough profile, DRS plan generation fails for the entire cluster
with an unhandled InvalidParameterValueException. No migrations are planned
for any VM in the cluster, including VMs that could otherwise be migrated.

Error log:

ERROR [o.a.c.c.ClusterDrsServiceImpl] Unable to generate DRS plans for cluster Cluster {id: "3", name: "princeton-cluster01"}
com.cloud.exception.InvalidParameterValueException: Unsupported operation, VM uses host passthrough, cannot migrate
    at org.apache.cloudstack.cluster.ClusterDrsServiceImpl.getBestMigration(ClusterDrsServiceImpl.java:453)
    at org.apache.cloudstack.cluster.ClusterDrsServiceImpl.getDrsPlan(ClusterDrsServiceImpl.java:362)
    at org.apache.cloudstack.cluster.ClusterDrsServiceImpl.generateDrsPlanForAllClusters(ClusterDrsServiceImpl.java:289)
    at org.apache.cloudstack.cluster.ClusterDrsServiceImpl.poll(ClusterDrsServiceImpl.java:181)

versions

ACS 4.21.0.0, KVM

The steps to reproduce the bug

  1. Enable automated DRS on a cluster (cluster.drs.enabled=true)
  2. Have at least one VM in the cluster using a GPU passthrough profile
  3. Wait for the DRS poll cycle to run

What to do about it?

In getBestMigration, catch the InvalidParameterValueException thrown by
listHostsForMigrationOfVM inside the VM loop, log it at debug level, and
skip that VM. The remaining eligible VMs are then still evaluated normally.
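
A minimal, self-contained sketch of the catch-and-skip idea, not the actual ClusterDrsServiceImpl code. The class DrsSkipSketch, the Vm type, the host names, and the simplified listHostsForMigrationOfVm stub are all hypothetical stand-ins. The real method has a different signature and return type, and the real code would log through the class logger rather than System.out.

```java
import java.util.ArrayList;
import java.util.List;

public class DrsSkipSketch {

    /** Hypothetical stand-in for com.cloud.exception.InvalidParameterValueException. */
    static class InvalidParameterValueException extends RuntimeException {
        InvalidParameterValueException(String message) {
            super(message);
        }
    }

    /** Hypothetical, heavily simplified VM model used only for this sketch. */
    static class Vm {
        final String name;
        final boolean usesHostPassthrough;

        Vm(String name, boolean usesHostPassthrough) {
            this.name = name;
            this.usesHostPassthrough = usesHostPassthrough;
        }
    }

    /**
     * Simplified stub that mimics the reported behaviour: it throws for a VM
     * that uses host passthrough and returns candidate hosts otherwise.
     */
    static List<String> listHostsForMigrationOfVm(Vm vm) {
        if (vm.usesHostPassthrough) {
            throw new InvalidParameterValueException(
                    "Unsupported operation, VM uses host passthrough, cannot migrate");
        }
        return List.of("host-a", "host-b"); // made-up host names
    }

    /**
     * Evaluates every VM and skips the ones that cannot be migrated,
     * instead of letting one exception abort the whole loop.
     */
    static List<Vm> collectMigratableVms(List<Vm> vms) {
        List<Vm> eligible = new ArrayList<>();
        for (Vm vm : vms) {
            try {
                List<String> hosts = listHostsForMigrationOfVm(vm);
                if (!hosts.isEmpty()) {
                    eligible.add(vm);
                }
            } catch (InvalidParameterValueException e) {
                // Stand-in for a debug-level log call; the VM is skipped.
                System.out.println("DEBUG Skipping VM " + vm.name + ": " + e.getMessage());
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        List<Vm> vms = List.of(
                new Vm("web-1", false),
                new Vm("gpu-1", true),
                new Vm("db-1", false));
        for (Vm vm : collectMigratableVms(vms)) {
            System.out.println("Eligible for DRS: " + vm.name);
        }
    }
}
```

Placing the try/catch inside the loop, rather than around it, is the point of the proposal: a single passthrough VM then costs one skipped iteration instead of the whole cluster's plan.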
