A flexible implementation of the Transaction Outbox Pattern for Java. TransactionOutbox has a clean, extensible API, very few dependencies, and plays nicely with a variety of database platforms, transaction management approaches and application frameworks. Every aspect is highly configurable or overridable. It features out-of-the-box support for Spring DI, Spring Txn, Guice, MySQL 5 & 8, PostgreSQL 9-12 and H2.
- Why do I need it?
- Installation
- Basic Configuration
- Set up the background worker
- Managing the "dead letter queue"
- Advanced
- Configuration reference
- Stubbing in tests
This article explains the concept in an abstract manner, but let's say we have a microservice that handles point-of-sale and need to implement a REST endpoint to record a sale. We end up with this:
@POST
@Path("/sales")
@Transactional
public SaleId createWidget(Sale sale) {
var saleId = saleRepository.save(sale);
messageQueue.postMessage(StockReductionEvent.of(sale.item(), sale.amount()));
messageQueue.postMessage(IncomeEvent.of(sale.value()));
return saleId;
}
The SaleRepository handles recording the sale in the customer's account, the StockReductionEvent goes off to our warehouse service, and the IncomeEvent goes to our financial records service (let's ignore the potential flaws in the domain modelling for now).
There's a big problem here: the @Transactional annotation is a lie (no, really). It wraps only the SaleRepository call, not the two event postings. This means that we could end up sending the two events and then fail to actually commit the sale. Our system is now inconsistent.
We could make the whole method idempotent and rewrite it to work a bit more like this:
@PUT
@Path("/sales/{id}")
public void createWidget(@PathParam("id") SaleId saleId, Sale sale) {
saleRepository.saveInNewTransaction(saleId, sale);
messageQueue.postMessage(StockReductionEvent.of(saleId, sale.item(), sale.amount()));
messageQueue.postMessage(IncomeEvent.of(saleId, sale.value()));
}
This is better. As long as the caller keeps calling the method until they get a success, we can keep re-saving and re-sending the messages without any risk of duplicating work. This works regardless of the order of the calls (and in any case, there may be good reasons of referential integrity to fix the order).
The problem is that they might stop trying, and if they do, we could end up with only part of this transaction completed. If this is a public API, we can't force clients to use it correctly.
We also still have another problem: external calls are inherently more vulnerable to downtime and performance degradation. We could find our service rendered unresponsive or failing if they are unavailable. Ideally, we would like to "buffer" these external calls within our service safely until our downstream dependencies are available.
Idempotency is a good thing, so let's stick with the PUT. Here is the same example, using Transaction Outbox:
@PUT
@Path("/sales/{id}")
@Transactional
public void createWidget(@PathParam("id") SaleId saleId, Sale sale) {
saleRepository.save(saleId, sale);
MessageQueue proxy = transactionOutbox.schedule(MessageQueue.class);
proxy.postMessage(StockReductionEvent.of(saleId, sale.item(), sale.amount()));
proxy.postMessage(IncomeEvent.of(saleId, sale.value()));
}
Here's what happens:
- When you create an instance of TransactionOutbox (see Basic Configuration), it will, by default, automatically create two database tables, TXNO_OUTBOX and TXNO_VERSION, and then keep these synchronized with schema changes as new versions are released. Note: this is the default behaviour on a SQL database, but is completely overridable if you are using a different type of data store or don't want a third party library managing your database schema. See Configuration reference.
- TransactionOutbox creates a proxy of MessageQueue. Any method calls on the proxy are serialized and written to the TXNO_OUTBOX table (by default) in the same transaction as the SaleRepository call. The call returns immediately rather than actually invoking the real method.
- If the transaction rolls back, so do the serialized requests.
- Immediately after the transaction is successfully committed, another thread will attempt to make the real call to MessageQueue asynchronously.
- If that call fails, or the application dies before the call is attempted, a background "mop-up" thread will re-attempt the call a configurable number of times, with configurable time between each, before blocking the request and firing an event for it to be investigated (similar to a dead letter queue).
- Blocked requests can be easily unblocked again once the underlying issue is resolved.
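The steps above can be sketched in plain Java. This is a minimal illustration of the idea only, not the library's implementation: OutboxSketch, its in-memory lists (standing in for the TXNO_OUTBOX table and the surrounding transaction) and the commit, rollback and dispatch methods are all hypothetical names invented for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are part of the transaction-outbox API.
final class OutboxSketch {

  /** Stand-in for the MessageQueue used in the example above. */
  public interface MessageQueue {
    void postMessage(String event);
  }

  /** A recorded method call. The real library serializes these to a table instead. */
  private static final class Invocation {
    final Method method;
    final Object[] args;

    Invocation(Method method, Object[] args) {
      this.method = method;
      this.args = args;
    }
  }

  // Calls recorded inside the current "transaction"
  private final List<Invocation> pending = new ArrayList<>();
  // Calls whose "transaction" has committed and which are waiting to be sent
  private final List<Invocation> committed = new ArrayList<>();

  /** Returns a proxy that records calls rather than invoking them (void methods only). */
  @SuppressWarnings("unchecked")
  <T> T schedule(Class<T> type) {
    return (T) Proxy.newProxyInstance(
        type.getClassLoader(),
        new Class<?>[] {type},
        (proxy, method, args) -> {
          pending.add(new Invocation(method, args));
          return null; // Return immediately; nothing has been sent yet
        });
  }

  /** Commit: the recorded calls become eligible for dispatch. */
  void commit() {
    committed.addAll(pending);
    pending.clear();
  }

  /** Rollback: the recorded calls vanish along with the rest of the transaction. */
  void rollback() {
    pending.clear();
  }

  /** Replays committed calls against the real target and returns how many were made. */
  int dispatch(Object target) {
    int sent = 0;
    for (Invocation invocation : committed) {
      try {
        invocation.method.invoke(target, invocation.args);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException(e);
      }
      sent++;
    }
    committed.clear();
    return sent;
  }
}
```

The sketch leaves out everything that makes the real thing useful in production (durable storage, retries, blocking), but it shows why a rolled-back transaction never results in a message being sent.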
Our service is now resilient and explicitly eventually consistent, as long as all three elements (SaleRepository and the downstream event handlers) are idempotent, since those messages will be attempted repeatedly until confirmed successful, which means they could occur multiple times.
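To make the idempotency requirement concrete, here is a minimal sketch of what an idempotent downstream handler might look like. IdempotentStockHandler and its methods are hypothetical; the only idea taken from the text is that the sale id travels with the event, so a repeated delivery can be recognised and ignored.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical warehouse-side handler; not part of the transaction-outbox API.
final class IdempotentStockHandler {
  // Sale ids already applied. A real service would persist these alongside the stock level.
  private final Set<String> processedSaleIds = new HashSet<>();
  private int stock;

  IdempotentStockHandler(int initialStock) {
    this.stock = initialStock;
  }

  /** Applies the reduction once per sale id. Returns false if this delivery was a duplicate. */
  synchronized boolean onStockReduction(String saleId, int amount) {
    if (!processedSaleIds.add(saleId)) {
      return false; // Already handled, so a retry changes nothing
    }
    stock -= amount;
    return true;
  }

  synchronized int stock() {
    return stock;
  }
}
```

With a handler shaped like this, the outbox can deliver the same event any number of times and the stock level is reduced only once.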
If you find yourself wondering why we should bother with the queues at all now, you're quite right. As we now have outgoing buffers, we already have most of the benefits of middleware (at least for some use cases). We could replace the calls to a message queue with direct calls to the other services' load balancers and switch to a peer-to-peer architecture, if we so choose.
Note that for the above example to work, StockReductionEvent and IncomeEvent need to be included for serialization. See Configuration reference.
- At least Java 11. Downgrading to requiring Java 8 is under consideration.
- Currently, MySQL, PostgreSQL or H2 databases (pull requests to support Oracle, SQL Server or any other traditional RDBMS would be trivial. Beyond that, a SQL database is not strictly necessary for the pattern to work, merely a data store with the concept of a transaction spanning multiple mutation operations).
- Database access via JDBC (In principle, JDBC should not be required - alternatives such as R2DBC are under investigation - but the API is currently tied to it)
- Native transactions (not JTA or similar).
The latest stable release is available from Maven Central. Stable releases are sort-of semantically versioned. That is, they follow semver in every respect other than that the version numbers are not contiguous. The project uses continuous delivery and selects individual stable releases to promote to Central, so Central releases will always be spaced apart numerically. The important thing, though, is that dependencies should be safe to upgrade as long as the major version number has not increased.
<dependency>
<groupId>com.gruelbox</groupId>
<artifactId>transactionoutbox-core</artifactId>
<version>4.1.230</version>
</dependency>
implementation 'com.gruelbox:transactionoutbox-core:4.1.230'
Maven Central is updated regularly. However, if you want to stay at the bleeding edge, you can use continuously-delivered releases from GitHub Package Repository. These can be used from your production builds since they will never be deleted (unlike SNAPSHOTs).
<repositories>
<repository>
<id>github-transaction-outbox</id>
<name>Gruelbox Github Repository</name>
<url>https://maven.pkg.github.com/gruelbox/transaction-outbox</url>
</repository>
</repositories>
You will need to authenticate with GitHub to use GitHub Package Repository. Create a personal access token in your GitHub settings. It only needs the read:packages permission. Then add something like the following in your Maven settings.xml:
<servers>
<server>
<id>github-transaction-outbox</id>
<username>${env.GITHUB_USERNAME}</username>
<password>${env.GITHUB_TOKEN}</password>
</server>
</servers>
The above example uses environment variables, allowing you to keep the credentials out of source control, but you can hard-code them if you know what you're doing.
repositories {
maven {
name = "github-transaction-outbox"
url = uri("https://maven.pkg.github.com/gruelbox/transaction-outbox")
credentials {
username = $githubUserName
password = $githubToken
}
}
}
An application needs a single, shared instance of TransactionOutbox, which is configured using a builder on construction. This takes some time to get right, particularly if you already have a transaction management solution in your application.
If you have no existing transaction management, connection pooling or dependency injection, here's a quick way to get started:
// Use an in-memory H2 database
TransactionManager transactionManager = TransactionManager.fromConnectionDetails(
"org.h2.Driver", "jdbc:h2:mem:test;MV_STORE=TRUE", "test", "test");
// Create the outbox
TransactionOutbox outbox = TransactionOutbox.builder()
.transactionManager(transactionManager)
.persistor(Persistor.forDialect(Dialect.H2))
.build();
// Start a transaction
transactionManager.inTransaction(tx -> {
// Save some stuff
tx.connection().createStatement().execute("INSERT INTO...");
// Create an outbox request
outbox.schedule(MyClass.class).myMethod("Foo", "Bar");
});
Alternatively, you could create the TransactionManager from a DataSource, allowing you to use a connection pooling DataSource such as Hikari:
TransactionManager transactionManager = TransactionManager.fromDataSource(dataSource);
In this default configuration, MyClass must have a default constructor so the "real" implementation can be constructed at the point the method is actually invoked (which might be on another day on another instance of the application). However, you can avoid this requirement by providing an Instantiator on every instance of your application that knows how to create the objects:
TransactionOutbox outbox = TransactionOutbox.builder()
.instantiator(Instantiator.using(clazz -> createInstanceOf(clazz)))
.build();
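The createInstanceOf call in the snippet above is left undefined. One possible shape for it, assuming you have no dependency injection framework, is a small registry of factories that falls back to the default constructor. ServiceRegistry and register are hypothetical names for this sketch and are not provided by the library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical hand-rolled service locator; not part of the transaction-outbox API.
final class ServiceRegistry {
  private final Map<Class<?>, Supplier<?>> factories = new ConcurrentHashMap<>();

  /** Registers a factory for types that cannot be built with a default constructor. */
  <T> void register(Class<T> type, Supplier<? extends T> factory) {
    factories.put(type, factory);
  }

  /** Uses a registered factory if there is one, otherwise tries the default constructor. */
  Object createInstanceOf(Class<?> clazz) {
    Supplier<?> factory = factories.get(clazz);
    if (factory != null) {
      return factory.get();
    }
    try {
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(
          "No factory or usable default constructor for " + clazz.getName(), e);
    }
  }
}
```

Because the method takes a Class and returns an instance, a method reference such as registry::createInstanceOf has the same shape as the lambda passed to Instantiator.using above.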
See transaction-outbox-spring, which integrates Spring's DI and/or transaction management with TransactionOutbox.
See transaction-outbox-guice, which integrates Guice DI with TransactionOutbox.
See transaction-outbox-jooq, which integrates jOOQ transaction management with TransactionOutbox.
At the moment, if any work fails first time, it won't be retried. All we need to add is a background thread that repeatedly calls TransactionOutbox.flush() to pick up and reprocess stale work.
How you do this is up to you; it very much depends on how background processing works in your application (a reactive solution will be very different to one based on Guava Service, for example). However, here is a simple example:
Thread backgroundThread = new Thread(() -> {
while (!Thread.interrupted()) {
try {
// Keep flushing work until there's nothing left to flush
while (outbox.flush()) {}
} catch (Exception e) {
log.error("Error flushing transaction outbox. Pausing", e);
}
try {
// When we run out of work, pause for a minute before checking again
Thread.sleep(60_000);
} catch (InterruptedException e) {
break;
}
}
});
// Startup
backgroundThread.start();
// Shut down
backgroundThread.interrupt();
backgroundThread.join();
Don't worry about it running on multiple instances simultaneously. It's designed to handle concurrent use (particularly on databases that support SKIP LOCKED, such as Postgres and MySQL 8+), and running it on several instances can indeed be a benefit, spreading high workloads across instances without any need for more complex high-availability configuration. That said, if you want to distribute work across a cluster at point of submission, this is also supported.
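If you would rather not manage a raw thread, the same loop can be expressed with a ScheduledExecutorService. The following is a sketch under assumptions: ScheduledFlusher and drain are hypothetical names, and the BooleanSupplier stands in for a call such as outbox::flush that returns true while there is more work to do.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of the transaction-outbox API.
final class ScheduledFlusher implements AutoCloseable {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /** Starts draining immediately, then again after each pause of the given length. */
  ScheduledFlusher(BooleanSupplier flush, long pause, TimeUnit unit) {
    scheduler.scheduleWithFixedDelay(() -> drain(flush), 0, pause, unit);
  }

  /** Keeps flushing until there is nothing left. Returns the number of productive flushes. */
  static int drain(BooleanSupplier flush) {
    int batches = 0;
    try {
      while (flush.getAsBoolean()) {
        batches++;
      }
    } catch (RuntimeException e) {
      // Swallowed deliberately: an exception escaping a scheduled task cancels all
      // future runs. A real application would log the error here.
    }
    return batches;
  }

  /** Stops the schedule and interrupts any flush in progress. */
  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Usage would look something like new ScheduledFlusher(outbox::flush, 1, TimeUnit.MINUTES) at startup, with close() called at shutdown.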
Work might be retried too many times and enter a blocked state. You should set up an alert to allow you to manage this when it occurs, resolve the issue and unblock the work, since the work not being complete will usually be a sign that your system is out of sync in some way.
TransactionOutbox.builder()
...
.listener(new TransactionOutboxListener() {
@Override
public void blocked(TransactionOutboxEntry entry, Throwable cause) {
// Spring example
applicationEventPublisher.publishEvent(new TransactionOutboxBlockedEvent(entry.getId(), cause));
}
})
.build();
To mark the work for reprocessing, just use TransactionOutbox.unblock(). Its failure count will be reset to zero and it will get reprocessed on the next call to flush():
transactionOutbox.unblock(entryId);
Or if using a TransactionManager that relies on explicit context (such as a non-thread local JooqTransactionManager):
transactionOutbox.unblock(entryId, context);
A good approach here is to use the TransactionOutboxListener callback to post an interactive Slack message - this can operate as both the alert and the "button" allowing a support engineer to submit the work for reprocessing.
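The lifecycle described in this section (retry, block, alert, unblock) can be sketched as a small state machine. This is an illustration only: RetryingEntry is a hypothetical class, and the exact point at which the library blocks an entry relative to its configured attempt limit is an assumption here.

```java
import java.util.function.Consumer;

// Hypothetical model of a single outbox entry; not part of the transaction-outbox API.
final class RetryingEntry {
  private final String id;
  private final Runnable work;
  private final int blockAfterAttempts;
  private int attempts;
  private boolean blocked;
  private boolean done;

  RetryingEntry(String id, Runnable work, int blockAfterAttempts) {
    this.id = id;
    this.work = work;
    this.blockAfterAttempts = blockAfterAttempts;
  }

  /** One attempt, as a flush might make. Calls onBlocked once the attempt limit is reached. */
  void attempt(Consumer<String> onBlocked) {
    if (blocked || done) {
      return; // Blocked entries are skipped until someone unblocks them
    }
    try {
      work.run();
      done = true;
    } catch (RuntimeException e) {
      attempts++;
      if (attempts >= blockAfterAttempts) {
        blocked = true;
        onBlocked.accept(id); // For example, raise an alert
      }
    }
  }

  /** Resets the failure count so the next attempt is allowed to run. */
  void unblock() {
    attempts = 0;
    blocked = false;
  }

  boolean isBlocked() {
    return blocked;
  }

  boolean isDone() {
    return done;
  }
}
```

The onBlocked callback plays the role of the listener's blocked method, and unblock mirrors the reset-and-retry behaviour described above.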
In practice it can be extremely hard to guarantee that an entire unit of work is idempotent and thus suitable for retry. For example, the request might be to "update a customer record" with a new address, but this might record the change to an audit history table with a fresh UUID, the current date and time and so on, which in turn triggers external changes outside the transaction. The parent customer update request may be idempotent, but the downstream effects may not be.
To tackle this, TransactionOutbox supports a use case where outbox requests spawn further outbox requests, along with a layer of additional idempotency protection for particularly difficult cases. The nested pattern works as follows:
- Modify the customer record: outbox.schedule(CustomerService.class).update(newDetails)
- The update method spawns a new outbox request to process the downstream effect: outbox.schedule(AuditService.class).audit("CUSTOMER_UPDATED", UUID.randomUUID(), Instant.now(), newDetails.customerId())
Now, if any part of the top-level request throws, nothing occurs. If the top level request succeeds, an idempotent request to create the audit record will retry safely.
A common use case for TransactionOutbox is to receive an incoming request (such as a message from a message queue), acknowledge it immediately and process it asynchronously, for example:
public class FooEventHandler implements SQSEventHandler<ThingHappenedEvent> {
@Inject private TransactionOutbox outbox;
public void handle(ThingHappenedEvent event) {
outbox.schedule(FooService.class).handleEvent(event.getThingId());
}
}
However, incoming transports, whether they be message queues or APIs, usually need to rely on idempotency in message handlers (for the same reason that outgoing requests from outbox also rely on idempotency). This means the above code could get called twice.
As long as FooService.handleEvent() is idempotent itself, this is harmless, but we can't always assume this. The incoming message might be a broadcast, with no knowledge of the behaviour of handlers and therefore no way of pre-generating any new record ids the handler might need and passing them in the message.
To protect against this, TransactionOutbox can automatically detect duplicate requests and reject them with AlreadyScheduledException. Records of requests are retained up to a configurable threshold (see below).
To use this, use the call pattern:
outbox.with()
.uniqueRequestId("context-clientid")
.schedule(Service.class)
.process("Foo");
Where context-clientid is a globally-unique identifier derived from the incoming request. Such ids are usually available from queue middleware as message ids, or if not, you can require one as part of the incoming API (possibly with a tenant prefix to ensure global uniqueness across tenants).
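The duplicate detection idea can be illustrated with a small in-memory sketch. DuplicateRequestGuard and tryAccept are hypothetical names; the library itself keeps these records in the database and signals a duplicate by throwing AlreadyScheduledException, whereas this sketch simply returns false.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for duplicate request detection.
final class DuplicateRequestGuard {
  // Unique request id mapped to the time it was first accepted
  private final Map<String, Instant> seen = new HashMap<>();
  private final Duration retention;

  DuplicateRequestGuard(Duration retention) {
    this.retention = retention;
  }

  /** Returns true if the id is new within the retention window, false if it is a duplicate. */
  synchronized boolean tryAccept(String uniqueRequestId, Instant now) {
    // Forget records older than the retention threshold
    seen.values().removeIf(firstSeen -> firstSeen.plus(retention).isBefore(now));
    return seen.putIfAbsent(uniqueRequestId, now) == null;
  }
}
```

Note the trade-off the retention threshold implies: once a record has expired, a late duplicate with the same id is accepted again, so the threshold should comfortably exceed the longest redelivery delay you expect from the incoming transport.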
This example shows a number of other configuration options in action:
TransactionManager transactionManager = TransactionManager.fromDataSource(dataSource);
TransactionOutbox outbox = TransactionOutbox.builder()
// The most complex part to set up for most will be synchronizing with your existing transaction
// management. Pre-rolled implementations are available for jOOQ and Spring (see above for more information)
// and you can use those examples to synchronize with anything else by defining your own TransactionManager.
// Or, if you have no formal transaction management at the moment, why not start, using transaction-outbox's
// built-in one?
.transactionManager(transactionManager)
// Modify how requests are persisted to the database. For more complex modifications, you may wish to subclass
// DefaultPersistor, or create a completely new Persistor implementation.
.persistor(DefaultPersistor.builder()
// Selecting the right SQL dialect ensures that features such as SKIP LOCKED are used correctly.
.dialect(Dialect.POSTGRESQL_9)
// Override the table name (defaults to "TXNO_OUTBOX")
.tableName("transactionOutbox")
// Shorten the time we will wait for write locks (defaults to 2)
.writeLockTimeoutSeconds(1)
// Disable automatic creation and migration of the outbox table, forcing the application to manage
// migrations itself
.migrate(false)
// Allow the SaleType enum and Money class to be used in arguments (see example below)
.serializer(DefaultInvocationSerializer.builder()
.serializableTypes(Set.of(SaleType.class, Money.class))
.build())
.build())
// GuiceInstantiator and SpringInstantiator are great if you are using Guice or Spring DI, but what if you
// have your own service locator? Wire it in here. Fully-custom Instantiator implementations are easy to
// implement.
.instantiator(Instantiator.using(myServiceLocator::createInstance))
// Change the log level used when work cannot be submitted to a saturated queue to INFO level (the default
// is WARN, which you should probably consider a production incident). You can also change the Executor used
// for submitting work to a shared thread pool used by the rest of your application. Fully-custom Submitter
// implementations are also easy to implement.
.submitter(ExecutorSubmitter.builder()
.executor(ForkJoinPool.commonPool())
.logLevelWorkQueueSaturation(Level.INFO)
.build())
// Lower the log level when a task fails temporarily from the default WARN.
.logLevelTemporaryFailure(Level.INFO)
// 10 attempts at a task before blocking it.
.blockAfterAttempts(10)
// When calling flush(), select 0.5m records at a time.
.flushBatchSize(500_000)
// Flush once every 15 minutes only
.attemptFrequency(Duration.ofMinutes(15))
// Include Slf4j's Mapped Diagnostic Context in tasks. This means that anything in the MDC when schedule()
// is called will be recreated in the task when it runs. Very useful for tracking things like user ids and
// request ids across invocations.
.serializeMdc(true)
// Sets how long we should keep records of requests with a unique request id so duplicate requests
// can be rejected. Defaults to 7 days.
.retentionThreshold(Duration.ofDays(1))
// We can intercept task successes, single failures and blocked tasks. The most common use is to catch blocked tasks
// and raise alerts for these to be investigated. A Slack interactive message is particularly effective here
// since it can be wired up to call unblock() automatically.
.listener(new TransactionOutboxListener() {
@Override
public void success(TransactionOutboxEntry entry) {
eventPublisher.publish(new OutboxTaskProcessedEvent(entry.getId()));
}
@Override
public void blocked(TransactionOutboxEntry entry, Throwable cause) {
eventPublisher.publish(new BlockedOutboxTaskEvent(entry.getId()));
}
})
.build();
// Usage example, using the in-built transaction manager
MDC.put("SESSIONKEY", "Foo");
try {
transactionManager.inTransaction(tx -> {
writeSomeChanges(tx.connection());
outbox.schedule(getClass())
.performRemoteCall(SaleType.SALE, Money.of(10, Currency.getInstance("USD")));
});
} finally {
MDC.clear();
}
TransactionOutbox should not be directly stubbed (e.g. using Mockito); the contract is too complex to stub out.
Instead, stubs exist for the various arguments to the builder, allowing you to build a TransactionOutbox with minimal external dependencies which can be called and verified in tests.
// GIVEN
SomeService mockService = Mockito.mock(SomeService.class);
// Also see StubParameterContextTransactionManager
TransactionManager transactionManager = new StubThreadLocalTransactionManager();
TransactionOutbox outbox = TransactionOutbox.builder()
.instantiator(Instantiator.using(clazz -> mockService)) // Return our mock
.persistor(StubPersistor.builder().build()) // Doesn't save anything
.submitter(Submitter.withExecutor(MoreExecutors.directExecutor())) // Execute all work in-line
.clockProvider(() ->
Clock.fixed(LocalDateTime.of(2020, 3, 1, 12, 0)
.toInstant(ZoneOffset.UTC), ZoneOffset.UTC)) // Fix the clock (not necessary here)
.transactionManager(transactionManager)
.build();
// WHEN
transactionManager.inTransaction(tx ->
outbox.schedule(SomeService.class).doAThing(1));
// THEN
Mockito.verify(mockService).doAThing(1);
Depending on the type of test, you may wish to use a real Persistor such as DefaultPersistor (if there's a real database available) or a real, multi-threaded Submitter. That's up to you.