CAMEL-21438: De-flake timing-sensitive component tests#24338
Conversation
Replace the nested Awaitility(500ms) + MockEndpoint.assertIsSatisfied pattern with a single expectedMinimumMessageCount/assertIsSatisfied call. The fixed 500ms deadline assumed the scheduler thread pool warmed up instantly; under CI load the first polls could miss the window and the test failed intermittently. Letting assertIsSatisfied do the waiting keeps the happy path fast while tolerating a slow start. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bound polling with repeatCount so the exact message counts hold by construction instead of relying on the 60s default poll delay to fit a single batch poll inside the assertion window. Also drop the tight 3s result-wait caps in AtomPollingLowDelayTest and AtomPollingUnthrottledTest in favor of the default wait, so a slow scheduler under CI load no longer causes false timeouts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
🌟 Thank you for your contribution to the Apache Camel project! 🌟 🐫 Apache Camel Committers, please review the following items:
|
The consumer delivers one decoded message per poll; with the default 1s initial delay and 500ms cadence the six expected messages take ~3.5s, so assertIsSatisfied(5000) raced its own budget and failed intermittently under load. Poll faster (initialDelay/delay=200ms) so the messages arrive in ~1.2s, and use the default assert wait so the budget comfortably exceeds the work. Also drop the misleading expectedMessageCount(1) that expectedBodiesReceived already overrides with the exact body count. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
🧪 CI tested the following changed modules:
All tested modules (12 modules)
|
…chRestClientComponentIT The test slept a fixed 5s hoping the Elasticsearch native security realm was ready, then ran the early create-index/index/get-by-id operations without any retry. Under load ES is not ready in time, those un-retried operations get a 401, and the test fails. Replace the sleep with an Awaitility probe that polls _cluster/health until an authenticated request succeeds, so the test waits for actual readiness (robust under load, fast when ready) and no longer uses Thread.sleep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Soften the wording to reflect that the default poll schedule only races the 5s timeout under CI load, not in the happy path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
davsclaus
left a comment
There was a problem hiding this comment.
Review: CAMEL-21438 — De-flake timing-sensitive component tests
Clean, well-documented set of test-only changes. Each fix addresses a real timing race with a clear root cause explanation.
This review evaluates the PR against project rules and conventions. It does not replace specialized AI review tools (CodeRabbit, Sourcery) or static analysis (SonarCloud).
Verified against project conventions
- Commit format: All 5 commits follow
CAMEL-21438: ...format. ✅ - Branch name:
CAMEL-21438follows the fix branch convention. ✅ - JIRA linkage: Properly references CAMEL-21438. ✅
- AI attribution: Present in PR body and commits. ✅
- Thread.sleep: Removes
Thread.sleep(5000)in Elasticsearch test, replaces with Awaitility — directly follows the project's Awaitility mandate. ✅ - No production code changes: All changes are test-only. ✅
Git history confirms no conflicts with prior work
AtomPollingConsumerIdleMessageTest: The existing Awaitility usage (added in7239aff82c98) was flawed — it wrappedassertMockEndpointsSatisfiedinside a 500ms Awaitility deadline, creating two competing wait mechanisms. This PR correctly removes the unnecessary Awaitility wrapper.- Elasticsearch
Thread.sleep(5000): Added as an explicit workaround in7d63d63bd8d2(CAMEL-22111) which itself noted "Maybe there could be a way to ensure that everything is ready correctly with a more precise/different pattern." This PR provides that proper pattern — an Awaitility probe against_cluster/health. - Atom count-based tests: Using
repeatCountto bound poll cycles makes counts deterministic by construction rather than relying on timing.
No issues found. Nice work stabilizing these tests.
This review was generated by an AI agent and may contain inaccuracies. Please verify all suggestions before applying.
Description
Part of the long-standing flaky-tests effort (CAMEL-21438). These are test-only changes that remove timing races in tests; no component behavior changes.
camel-atom
The atom consumer is a scheduled poller, and several tests asserted message counts that depended on wall-clock scheduling.
AtomPollingConsumerIdleMessageTestwrappedMockEndpoint.assertIsSatisfied(which has its own multi-second wait) insideAwaitility.await().atMost(500ms), so two waiting mechanisms fought each other and the fixed 500ms deadline assumed the scheduler thread pool warmed up instantly. Replaced with the idiomaticexpectedMinimumMessageCount(2)+assertIsSatisfied().AtomEntryPollingConsumerTest,AtomPollingConsumerTest,AtomPollingLowDelayTest,AtomPollingUnthrottledTest) asserted an exact message count against a consumer that polls forever. The counts held only because the unsetdelaydefaulted to 60s (FeedPollingConsumer.DEFAULT_CONSUMER_DELAY), so a single batch poll happened to fit inside the assertion window. Bounded the poll count explicitly withrepeatCountso the counts hold by construction, and dropped the tightsetResultWaitTime(3000L)caps in favor of the default wait.The idempotency/
GoodBlog*tests are intentionally left unchanged: they need multiple re-read polls to prove deduplication.camel-pg-replication-slot
PgReplicationSlotCamelITexpects six decoded messages (BEGIN/INSERT/COMMIT × 2). The consumer delivers one message per poll, and with the default scheduled-poll cadence (1s initial delay, 500ms thereafter) the sixth message arrives around 3.5s, soassertIsSatisfied(5000)raced its own 5s budget. Reproduced locally at 5.17s / 5.20s (grazing the limit). Fixed by polling faster (initialDelay=200&delay=200, messages now arrive in ~1.2s) and using the default assert wait so the budget comfortably exceeds the work; verified at 2.65s. Also removed a misleadingexpectedMessageCount(1)thatexpectedBodiesReceivedalready overrides with the exact body count.camel-elasticsearch-rest-client
ElasticsearchRestClientComponentITslept a fixedThread.sleep(5000)hoping the Elasticsearch native security realm was ready, then ran the earlycreate-index/index/get-by-idoperations without any retry. Under load ES is not ready in time, those un-retried operations get a 401, and the test fails. Replaced the sleep with an Awaitility probe that polls_cluster/healthuntil an authenticated request returns 200, so it waits for actual readiness (robust under load, fast when ready) and no longer usesThread.sleep.Target
mainbranch)Tracking
Apache Camel coding standards and style
mvn clean install -DskipTestslocally from root folder and I have committed all auto-generated changes.AI-assisted contributions
Co-authored-bytrailers) and the PR description identifies the AI tool used.Claude Code on behalf of Adriano Machado
🤖 Generated with Claude Code