Skip to content

[fix][test] Fix flaky PersistentStickyKeyDispatcherMultipleConsumersClassicTest.testSkipRedeliverTemporally#25385

Merged
merlimat merged 1 commit intoapache:masterfrom
lhotari:worktree-zippy-wishing-meerkat
Mar 23, 2026
Merged

[fix][test] Fix flaky PersistentStickyKeyDispatcherMultipleConsumersClassicTest.testSkipRedeliverTemporally#25385
merlimat merged 1 commit intoapache:masterfrom
lhotari:worktree-zippy-wishing-meerkat

Conversation

@lhotari
Copy link
Copy Markdown
Member

@lhotari lhotari commented Mar 22, 2026

Motivation

Fix flaky testSkipRedeliverTemporally in PersistentStickyKeyDispatcherMultipleConsumersClassicTest.

The test was flaky because the asyncReadEntriesOrWait mock returned the same entries (msg1, msg2 for the slow consumer) indefinitely, creating an infinite normal-read loop. Normal reads couldn't deliver to the slow consumer because getRestrictedMaxEntriesForConsumer returns 0 when readType==Normal and the stickyKeyHashes are in redeliveryMessages. Only replay reads can deliver these messages, but the isDispatcherStuckOnReplays flag prevented replay reads from being attempted.

In a real cursor, asyncReadEntriesOrWait advances the read position — entries can only be re-read through asyncReplayEntries. The mock didn't simulate this behavior.

Modifications

  1. Realistic cursor mockasyncReadEntriesOrWait now tracks which entries it has already returned and doesn't return them again (simulating cursor read position advancement). When no new entries exist, it doesn't call the callback (simulating "OrWait" behavior), which stops the infinite normal-read loop.

  2. Explicit readMoreEntriesAsync() call — After setting the slow consumer's permits, the test triggers a new dispatch cycle. This mimics what consumerFlow does in production when a consumer sends more permits. With the loop stopped and isDispatcherStuckOnReplays never set, the dispatcher correctly enters the replay read path and delivers messages to the slow consumer.

Verifying this change

This change is already covered by existing tests:

  • PersistentStickyKeyDispatcherMultipleConsumersClassicTest.testSkipRedeliverTemporally — the test itself, verified passing 5/5 consecutive runs
  • All 7 tests in PersistentStickyKeyDispatcherMultipleConsumersClassicTest pass

Does this pull request potentially affect one of the following parts:

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc-not-needed

Matching PR in forked repository

PR in forked repository:

…stic cursor behavior

The test was flaky because the asyncReadEntriesOrWait mock returned
the same entries indefinitely (entries for the slow consumer were never
added to alreadySent since they couldn't be delivered). This created an
infinite normal-read loop where the isDispatcherStuckOnReplays flag
prevented replay reads from ever being attempted.

In a real cursor, asyncReadEntriesOrWait advances the read position -
entries can only be re-read through asyncReplayEntries. The fix makes
the mock track already-returned entries and simulate "wait" behavior
when no new entries are available (by not calling the callback).

Additionally, after setting the slow consumer's permits, the test now
explicitly triggers readMoreEntriesAsync() to kick off a new dispatch
cycle, which is what consumerFlow does in production when a consumer
sends more permits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Mar 22, 2026
@lhotari lhotari requested a review from merlimat March 22, 2026 20:37
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.75%. Comparing base (a27988b) to head (780f7e0).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #25385      +/-   ##
============================================
- Coverage     72.81%   72.75%   -0.07%     
- Complexity    33791    33887      +96     
============================================
  Files          1954     1954              
  Lines        154793   154857      +64     
  Branches      17731    17740       +9     
============================================
- Hits         112716   112662      -54     
- Misses        33066    33129      +63     
- Partials       9011     9066      +55     
Flag Coverage Δ
inttests 25.75% <ø> (-0.16%) ⬇️
systests 22.50% <ø> (-0.08%) ⬇️
unittests 73.72% <ø> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 93 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@merlimat merlimat merged commit 953d092 into apache:master Mar 23, 2026
150 of 156 checks passed
@lhotari lhotari added this to the 4.2.0 milestone Mar 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants