Skip to content

CAMEL-23234: JTA Shutdown rollback#22204

Merged
Croway merged 2 commits intoapache:mainfrom
Croway:jta-shutdown
Mar 24, 2026
Merged

CAMEL-23234: JTA Shutdown rollback#22204
Croway merged 2 commits intoapache:mainfrom
Croway:jta-shutdown

Conversation

@Croway
Copy link
Contributor

@Croway Croway commented Mar 23, 2026

JIRA: CAMEL-23234

Description

Problem: When a Camel application receives a graceful shutdown while a transacted route is mid-processing (e.g., during a long-running stored procedure or delay that exceeds the grace period), in-flight transactions are committed instead of rolled back. The container destroys the connection pool during shutdown, but the transaction has already committed partial work.

Fix: The TransactionErrorHandler (both camel-jta and camel-spring) now tracks in-flight transacted exchanges. When DefaultShutdownStrategy triggers a forced shutdown after the grace period expires, prepareShutdown(forced=true) marks all in-flight exchanges as rollbackOnly. Additionally, after processByErrorHandler() returns, a preparingShutdown check sets rollbackOnly on any exchange that completed processing after the shutdown flag was set. Both mechanisms cause the transaction policy to throw, triggering a rollback before the connection pool is destroyed.

@github-actions
Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

Copy link
Contributor

@gnodet gnodet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code on behalf of Guillaume Nodet

Nice fix — the dual mechanism (in-flight set + post-processing preparingShutdown check) covers the race window well, and preparingShutdown is properly volatile in both implementations. Tests are thorough, especially the integration test with its well-documented 10-step sequence.

Two minor items:

  1. parent/pom.xml property ordering: agroal-version is inserted between activemq6-version and activemq-artemis-version, breaking alphabetical order. It should be placed after activemq-artemis-version (since ag > ac).

  2. TransactionalClientDataSourceForcedShutdownTest.java:676: Thread.sleep(2000) to "give the exchange time to complete" is a flaky pattern. A CountDownLatch or Awaitility would be more reliable.

Copy link
Contributor

@gnodet gnodet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code on behalf of Guillaume Nodet

The fix itself looks solid — the dual mechanism (in-flight set + post-processing preparingShutdown check) is well designed and thread-safe.

However, I'd like to see two things addressed before merging:

  1. parent/pom.xml property ordering: agroal-version is inserted between activemq6-version and activemq-artemis-version, breaking alphabetical order. It should be placed after activemq-artemis-version.

  2. TransactionalClientDataSourceForcedShutdownTest.java:676Thread.sleep(2000) must go. We've spent significant effort hunting down and eliminating Thread.sleep() in tests because they are a recurring source of flaky CI builds. They either sleep too long (slow builds) or not long enough (random failures on loaded CI agents). Please replace with a proper synchronization mechanism — add a CountDownLatch that the exchange signals when processing completes, then await() it with a timeout. The other tests in this PR already use this pattern correctly, so it should be straightforward.

…await

- Move agroal-version after activemq-artemis-version for alphabetical order
- Replace Thread.sleep(2000) with executor.awaitTermination() in
  TransactionalClientDataSourceForcedShutdownTest to avoid flaky tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@gnodet gnodet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code on behalf of Guillaume Nodet

LGTM, thanks for the fix!

@Croway Croway merged commit a4aae0b into apache:main Mar 24, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants