CAMEL-21438: De-flake timing-sensitive component tests #24338

Merged: davsclaus merged 5 commits into apache:main from ammachado:CAMEL-21438 on Jun 30, 2026
@ammachado ammachado commented Jun 30, 2026

Description

Part of the long-standing flaky-tests effort (CAMEL-21438). These are test-only changes that remove timing races in tests; no component behavior changes.

camel-atom

The atom consumer is a scheduled poller, and several tests asserted message counts that depended on wall-clock scheduling.

  • AtomPollingConsumerIdleMessageTest wrapped MockEndpoint.assertIsSatisfied (which has its own multi-second wait) inside Awaitility.await().atMost(500ms), so two waiting mechanisms fought each other and the fixed 500ms deadline assumed the scheduler thread pool warmed up instantly. Replaced with the idiomatic expectedMinimumMessageCount(2) + assertIsSatisfied().
  • Count-based tests (AtomEntryPollingConsumerTest, AtomPollingConsumerTest, AtomPollingLowDelayTest, AtomPollingUnthrottledTest) asserted an exact message count against a consumer that polls forever. The counts held only because the unset delay defaulted to 60s (FeedPollingConsumer.DEFAULT_CONSUMER_DELAY), so a single batch poll happened to fit inside the assertion window. Bounded the poll count explicitly with repeatCount so the counts hold by construction, and dropped the tight setResultWaitTime(3000L) caps in favor of the default wait.
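
To illustrate why bounding the poll count removes the race, here is a minimal JDK-only model. It is not Camel code and not the code in this PR. It assumes each poll delivers a fixed batch of entries; `BoundedPollModel`, `messagesSeen`, and the batch size of 7 are hypothetical, chosen only for illustration. A `repeatCount` of zero or less stands for a poller that fires forever.

```java
/** Toy model of a scheduled poller. Not Camel's implementation. */
final class BoundedPollModel {

    /**
     * Number of messages a mock would have seen when the assertion runs.
     *
     * @param batchSize          messages delivered by each poll (assumed constant)
     * @param repeatCount        maximum number of polls; zero or negative means unbounded
     * @param pollsFiredInWindow polls the scheduler managed to fire before the assertion ran
     */
    static long messagesSeen(int batchSize, long repeatCount, long pollsFiredInWindow) {
        long polls = repeatCount > 0
                ? Math.min(repeatCount, pollsFiredInWindow)
                : pollsFiredInWindow;
        return polls * batchSize;
    }

    public static void main(String[] args) {
        // Unbounded: the count depends on how many polls happened to fire.
        System.out.println(messagesSeen(7, 0, 1)); // 7
        System.out.println(messagesSeen(7, 0, 2)); // 14
        // Bounded to one poll: the count is the same however long the test waits.
        System.out.println(messagesSeen(7, 1, 1)); // 7
        System.out.println(messagesSeen(7, 1, 2)); // 7
    }
}
```

In this model the bounded case gives the same count for any window that fits at least one poll, so an exact-count assertion no longer depends on the poll delay.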

The idempotency/GoodBlog* tests are intentionally left unchanged: they need multiple re-read polls to prove deduplication.

camel-pg-replication-slot

PgReplicationSlotCamelIT expects six decoded messages (BEGIN/INSERT/COMMIT × 2). The consumer delivers one message per poll, and with the default scheduled-poll cadence (1s initial delay, 500ms thereafter) the sixth message arrives at about 3.5s at the earliest (1s + 5 × 500ms). That leaves only about 1.5s of headroom, so under load assertIsSatisfied(5000) raced its own 5s budget. Reproduced locally at 5.17s / 5.20s (grazing the limit). Fixed by polling faster (initialDelay=200&delay=200, so messages now arrive in ~1.2s) and using the default assert wait so the budget comfortably exceeds the work; verified at 2.65s. Also removed a misleading expectedMessageCount(1), which expectedBodiesReceived already overrides with the exact body count.
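
The timing figures above follow from simple arithmetic, sketched here as a small helper. `PollSchedule` and `arrivalMillis` are hypothetical names, not part of Camel or this PR. The sketch assumes one message per poll and ignores the time spent inside each poll, so the results are best-case lower bounds.

```java
/** Best-case arrival time of the n-th message when each poll delivers one message. */
final class PollSchedule {

    /**
     * @param initialDelayMillis delay before the first poll
     * @param delayMillis        delay between subsequent polls
     * @param n                  1-based index of the message of interest
     * @return earliest time in milliseconds at which the n-th message can arrive
     */
    static long arrivalMillis(long initialDelayMillis, long delayMillis, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // The first poll fires after the initial delay, each later poll one delay after that.
        return initialDelayMillis + (n - 1) * delayMillis;
    }

    public static void main(String[] args) {
        System.out.println(arrivalMillis(1000, 500, 6)); // 3500: default cadence
        System.out.println(arrivalMillis(200, 200, 6));  // 1200: initialDelay=200&delay=200
    }
}
```

Any time spent executing the polls themselves adds to these numbers, which is consistent with the default cadence only grazing the 5s limit under load.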

camel-elasticsearch-rest-client

ElasticsearchRestClientComponentIT slept a fixed Thread.sleep(5000) hoping the Elasticsearch native security realm was ready, then ran the early create-index/index/get-by-id operations without any retry. Under load ES is not ready in time, those un-retried operations get a 401, and the test fails. Replaced the sleep with an Awaitility probe that polls _cluster/health until an authenticated request returns 200, so it waits for actual readiness (robust under load, fast when ready) and no longer uses Thread.sleep.
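
The shape of such a readiness probe can be sketched with the JDK alone. This is not the code in the PR, which uses Awaitility; `ReadinessProbe` and `awaitReady` are hypothetical names, and the `BooleanSupplier` stands in for "an authenticated request to _cluster/health returned 200". The short pause between attempts is what a polling library does internally, and it differs from a single fixed sleep because the loop exits as soon as the probe succeeds.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Polls a readiness check until it passes or a deadline expires. */
final class ReadinessProbe {

    /**
     * @param probe    returns true once the service is ready (assumed cheap and side-effect free)
     * @param timeout  overall budget for the wait
     * @param interval pause between attempts
     * @return true if the probe succeeded within the timeout, false otherwise
     */
    static boolean awaitReady(BooleanSupplier probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (probe.getAsBoolean()) {
                return true; // ready: return immediately, no fixed wait
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // budget exhausted
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical probe that becomes ready on the third attempt.
        boolean ready = awaitReady(() -> ++calls[0] >= 3,
                Duration.ofSeconds(2), Duration.ofMillis(10));
        System.out.println(ready + " after " + calls[0] + " attempts");
    }
}
```

A caller would fail the test when `awaitReady` returns false, so a service that never becomes ready is reported as a timeout instead of a later 401.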

Target

  • I checked that the commit is targeting the correct branch (Camel 4 uses the main branch)

Tracking

  • If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

  • I checked that each commit in the pull request has a meaningful subject line and body.
  • I have run mvn clean install -DskipTests locally from root folder and I have committed all auto-generated changes.

AI-assisted contributions

  • If this PR includes AI-generated code, commits have proper co-authorship attribution (e.g., Co-authored-by trailers) and the PR description identifies the AI tool used.

Claude Code on behalf of Adriano Machado

🤖 Generated with Claude Code

ammachado and others added 2 commits June 30, 2026 12:27
Replace the nested Awaitility(500ms) + MockEndpoint.assertIsSatisfied
pattern with a single expectedMinimumMessageCount/assertIsSatisfied call.
The fixed 500ms deadline assumed the scheduler thread pool warmed up
instantly; under CI load the first polls could miss the window and the
test failed intermittently. Letting assertIsSatisfied do the waiting
keeps the happy path fast while tolerating a slow start.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Bound polling with repeatCount so the exact message counts hold by
construction instead of relying on the 60s default poll delay to fit a
single batch poll inside the assertion window. Also drop the tight 3s
result-wait caps in AtomPollingLowDelayTest and AtomPollingUnthrottledTest
in favor of the default wait, so a slow scheduler under CI load no longer
causes false timeouts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

The consumer delivers one decoded message per poll; with the default 1s
initial delay and 500ms cadence the six expected messages take ~3.5s, so
assertIsSatisfied(5000) raced its own budget and failed intermittently
under load. Poll faster (initialDelay/delay=200ms) so the messages arrive
in ~1.2s, and use the default assert wait so the budget comfortably
exceeds the work. Also drop the misleading expectedMessageCount(1) that
expectedBodiesReceived already overrides with the exact body count.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot commented Jun 30, 2026

🧪 CI tested the following changed modules:

  • components/camel-atom
  • components/camel-elasticsearch-rest-client
  • components/camel-pg-replication-slot
All tested modules (12 modules)
  • Camel :: Atom
  • Camel :: ElasticSearch Rest Client
  • Camel :: JBang :: MCP
  • Camel :: JBang :: Plugin :: MCP
  • Camel :: JBang :: Plugin :: Route Parser
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: JBang :: Plugin :: Validate
  • Camel :: Launcher :: Container
  • Camel :: PgReplicationSlot
  • Camel :: RSS
  • Camel :: YAML DSL :: Validator
  • Camel :: YAML DSL :: Validator Maven Plugin

@ammachado ammachado changed the title CAMEL-21438: Improve camel-atom test health (de-flake polling consumer tests) CAMEL-21438: De-flake timing-sensitive component tests (camel-atom, camel-pg-replication-slot) Jun 30, 2026
…chRestClientComponentIT

The test slept a fixed 5s hoping the Elasticsearch native security realm
was ready, then ran the early create-index/index/get-by-id operations
without any retry. Under load ES is not ready in time, those un-retried
operations get a 401, and the test fails. Replace the sleep with an
Awaitility probe that polls _cluster/health until an authenticated request
succeeds, so the test waits for actual readiness (robust under load, fast
when ready) and no longer uses Thread.sleep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ammachado ammachado changed the title CAMEL-21438: De-flake timing-sensitive component tests (camel-atom, camel-pg-replication-slot) CAMEL-21438: De-flake timing-sensitive component tests Jun 30, 2026
@ammachado ammachado marked this pull request as ready for review June 30, 2026 17:11
Soften the wording to reflect that the default poll schedule only races
the 5s timeout under CI load, not in the happy path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@davsclaus davsclaus left a comment

Review: CAMEL-21438 — De-flake timing-sensitive component tests

Clean, well-documented set of test-only changes. Each fix addresses a real timing race with a clear root cause explanation.

This review evaluates the PR against project rules and conventions. It does not replace specialized AI review tools (CodeRabbit, Sourcery) or static analysis (SonarCloud).

Verified against project conventions

  • Commit format: All 5 commits follow CAMEL-21438: ... format. ✅
  • Branch name: CAMEL-21438 follows the fix branch convention. ✅
  • JIRA linkage: Properly references CAMEL-21438. ✅
  • AI attribution: Present in PR body and commits. ✅
  • Thread.sleep: Removes Thread.sleep(5000) in Elasticsearch test, replaces with Awaitility — directly follows the project's Awaitility mandate. ✅
  • No production code changes: All changes are test-only. ✅

Git history confirms no conflicts with prior work

  • AtomPollingConsumerIdleMessageTest: The existing Awaitility usage (added in 7239aff82c98) was flawed — it wrapped assertMockEndpointsSatisfied inside a 500ms Awaitility deadline, creating two competing wait mechanisms. This PR correctly removes the unnecessary Awaitility wrapper.
  • Elasticsearch Thread.sleep(5000): Added as an explicit workaround in 7d63d63bd8d2 (CAMEL-22111) which itself noted "Maybe there could be a way to ensure that everything is ready correctly with a more precise/different pattern." This PR provides that proper pattern — an Awaitility probe against _cluster/health.
  • Atom count-based tests: Using repeatCount to bound poll cycles makes counts deterministic by construction rather than relying on timing.

No issues found. Nice work stabilizing these tests.

This review was generated by an AI agent and may contain inaccuracies. Please verify all suggestions before applying.

@davsclaus davsclaus merged commit 3b426d1 into apache:main Jun 30, 2026
5 checks passed
@ammachado ammachado deleted the CAMEL-21438 branch July 1, 2026 04:17