Flaky Test: Tester.AzureUtils.Lease.LeaseBasedQueueBalancerTests.LeaseBalancedQueueBalancer_SupportUnexpectedNodeFailureScenerio #9559

Open
@ReubenBond

Description

This test has been identified as flaky in the CI pipeline.

Test:
Failure Rate: 2 failures out of 87 runs (97.7% success rate)
Average Duration: 48.32s

Test Details

  • Test Class:
  • Test Method:
  • Test Categories: Functional, AzureStorage, Lease

Test Description

This test verifies lease-based queue balancer behavior during unexpected node failures by:

  1. Starting with 6 queues and 4 silos (expecting 1-2 queues per silo)
  2. Killing one silo and verifying rebalancing (3 silos, 2 queues each)
  3. Killing another silo and verifying rebalancing (2 silos, 3 queues each)
  4. Starting a new silo and verifying rebalancing (3 silos, 2 queues each)
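The per-silo counts expected in each step follow from splitting 6 queues as evenly as possible across the live silos. A minimal sketch of that arithmetic, in Python for illustration only (the function name `expected_queue_counts` is hypothetical and is not part of the test or of Orleans):

```python
def expected_queue_counts(queue_count: int, silo_count: int) -> list[int]:
    """Return the per-silo queue counts of an even split, smallest first."""
    if silo_count <= 0:
        raise ValueError("silo_count must be positive")
    base, extra = divmod(queue_count, silo_count)
    # `extra` silos carry one additional queue; the others carry `base`.
    return [base] * (silo_count - extra) + [base + 1] * extra


# The four phases described above: (label, number of live silos).
phases = [
    ("initial", 4),
    ("after first kill", 3),
    ("after second kill", 2),
    ("after new silo", 3),
]
for label, silos in phases:
    print(label, expected_queue_counts(6, silos))
# initial [1, 1, 2, 2]
# after first kill [2, 2, 2]
# after second kill [3, 3]
# after new silo [2, 2, 2]
```

Only the initial phase has an uneven split, which is why it is the one step where the test accepts a range (1-2 queues per silo) rather than an exact count.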

Test Configuration

  • Lease length: 15 seconds
  • Lease renew period: 10 seconds
  • Lease acquisition period: 10 seconds
  • Test timeout: 2 minutes
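To get a feel for how tight these values are, here is a rough timing model, sketched in Python. It rests on assumptions the issue does not confirm: a killed silo may have renewed its lease just before dying (so the lease stays held for the full lease length), a surviving silo may have just finished an acquisition pass (so it waits a full acquisition period), and the 2-minute timeout covers all three rebalancing steps. Membership detection, Azure latency, and silo startup time are not modelled. The function name `timing_budget` is hypothetical.

```python
def timing_budget(lease_length_s: int, renew_period_s: int,
                  acquisition_period_s: int, timeout_s: int,
                  rebalance_steps: int) -> dict[str, int]:
    """Rough worst-case timing under the assumptions stated above."""
    # Time a healthy silo has left after a scheduled renewal before expiry.
    renew_slack_s = lease_length_s - renew_period_s
    # Orphaned lease must expire, then a survivor's next pass must pick it up.
    takeover_s = lease_length_s + acquisition_period_s
    total_s = rebalance_steps * takeover_s
    return {
        "renew_slack_s": renew_slack_s,
        "takeover_s": takeover_s,
        "total_s": total_s,
        "headroom_s": timeout_s - total_s,
    }


budget = timing_budget(lease_length_s=15, renew_period_s=10,
                       acquisition_period_s=10, timeout_s=120,
                       rebalance_steps=3)
print(budget)
# {'renew_slack_s': 5, 'takeover_s': 25, 'total_s': 75, 'headroom_s': 45}
```

Under this model a single renewal delayed by more than 5 seconds could let a healthy silo's lease lapse, and the 45 seconds of headroom has to absorb everything the model leaves out.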

Failure Pattern

The test has a high average duration (48.32s) and involves:

  • Azure Blob lease operations
  • Cluster membership changes through silo kills
  • Waiting for lease rebalancing to occur
  • Multiple timeout-based waits for agent ownership verification
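For reference, a timeout-based wait of the kind listed above usually takes the shape below. This is a generic Python sketch, not the helper the test actually uses; `wait_until` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout_s: float,
               poll_interval_s: float = 1.0) -> bool:
    """Poll `condition` until it returns True or `timeout_s` elapses."""
    # A monotonic clock keeps the deadline immune to wall-clock adjustments.
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(poll_interval_s, remaining))


# Example: a condition that becomes true on the third poll.
polls = []
ok = wait_until(lambda: len(polls) >= 2 or polls.append(1) is not None,
                timeout_s=1.0, poll_interval_s=0.01)
print(ok)  # True
```

A wait like this fails whenever the condition takes longer than the timeout, even if it would have become true shortly afterwards, so a slow CI machine can turn a correct rebalance into a test failure.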

Failures may be related to:

  • Azure Storage transient failures or throttling
  • Lease acquisition/renewal timing issues
  • Cluster membership propagation delays
  • Test environment performance affecting timing assumptions

Next Steps

  • Investigate the reliability of Azure Blob lease operations in the test environment
  • Review timeout values and consider increasing them for CI environments
  • Add diagnostic logging for lease state transitions
  • Consider retry logic for transient Azure Storage failures
  • Analyze whether the 15-second lease length is too aggressive for tests

Related

  • Test file:

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
