BestPossibleExternalViewVerifier Fails to Accurately Verify WAGED Resources Due to Incomplete Cluster State Consideration #2938

@MarkGaox

Describe the bug

We observed that after the cluster had converged, the BestPossibleExternalViewVerifier still failed to verify the state of certain resources. After enabling DEBUG logging, we found that the BestPossibleOutput computed for these resources contained 4 replicas, whereas the actual IdealState had only 3. The verifier therefore concluded, incorrectly, that the cluster had not fully converged.

This happens because the BestPossibleExternalViewVerifier computes the BestPossibleOutput only for the specified resources and does not account for the other resources in the cluster. Because these resources use WAGED rebalancing, running BestPossibleCalcStage as a dry run over that partial view produces an output that differs from the cluster's actual IdealState.

Steps to Reproduce

To reproduce the issue:

  1. Create a cluster with several resources and enable WAGED rebalancing on them.
  2. Use BestPossibleExternalViewVerifier.Builder to create a BestPossibleExternalViewVerifier instance.
  3. Provide only one resource by using builder.setResources(resource).
  4. Run the verifier. Verification fails because BestPossibleCalcStage is dry-run as if only the specified resource existed, ignoring the rest of the cluster.

Expected Behavior

When the cluster contains WAGED resources, the BestPossibleExternalViewVerifier should compute the IdealState for the full cluster and then return only the portion relevant to the user's request. That way it computes the correct IdealState for the requested WAGED resources.
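
To illustrate the difference, here is a minimal, self-contained sketch. It does not use Helix and does not reproduce the WAGED algorithm; `GlobalRebalanceSketch`, `computeAssignment`, `subsetOnly`, and `fullThenFilter` are hypothetical names, and the greedy least-loaded placement is only an assumed stand-in for any rebalancer whose output depends on every resource in the cluster.

```java
import java.util.*;

public class GlobalRebalanceSketch {

  // Toy stand-in for a global, load-aware rebalancer (NOT Helix's WAGED algorithm).
  // Resources are processed in name order; each replica goes to the least-loaded
  // instance that does not already host that resource. Ties go to the smaller name.
  static Map<String, Set<String>> computeAssignment(
      List<String> instances, SortedMap<String, Integer> replicasPerResource) {
    Map<String, Integer> load = new TreeMap<>();
    for (String instance : instances) {
      load.put(instance, 0);
    }
    Map<String, Set<String>> result = new TreeMap<>();
    for (Map.Entry<String, Integer> entry : replicasPerResource.entrySet()) {
      Set<String> placed = new TreeSet<>();
      for (int r = 0; r < entry.getValue(); r++) {
        String best = null;
        for (String instance : load.keySet()) {
          if (placed.contains(instance)) {
            continue;
          }
          if (best == null || load.get(instance) < load.get(best)) {
            best = instance;
          }
        }
        if (best == null) {
          break; // more replicas requested than instances available
        }
        placed.add(best);
        load.merge(best, 1, Integer::sum);
      }
      result.put(entry.getKey(), placed);
    }
    return result;
  }

  // Behavior described in this issue: pretend only the requested resources exist.
  static Map<String, Set<String>> subsetOnly(
      List<String> instances, SortedMap<String, Integer> all, Set<String> requested) {
    SortedMap<String, Integer> partial = new TreeMap<>(all);
    partial.keySet().retainAll(requested);
    return computeAssignment(instances, partial);
  }

  // Expected behavior: compute for the whole cluster, then keep the requested part.
  static Map<String, Set<String>> fullThenFilter(
      List<String> instances, SortedMap<String, Integer> all, Set<String> requested) {
    Map<String, Set<String>> full = computeAssignment(instances, all);
    full.keySet().retainAll(requested);
    return full;
  }

  public static void main(String[] args) {
    List<String> instances = Arrays.asList("i1", "i2", "i3");
    SortedMap<String, Integer> all = new TreeMap<>();
    all.put("A", 2);
    all.put("B", 2);
    Set<String> requested = Collections.singleton("B");

    System.out.println("subset only:      " + subsetOnly(instances, all, requested));
    System.out.println("full then filter: " + fullThenFilter(instances, all, requested));
  }
}
```

In this toy setup the two approaches place resource `B` on different instances, because the subset-only computation never sees the load that resource `A` already puts on `i1` and `i2`. A verifier comparing the subset-only result against the real cluster state would report a mismatch even though the cluster has converged.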

