High availability RabbitMQ client
Dealing with failure is a fact of life in distributed systems. Lyra is a RabbitMQ client that embraces failure, helping you achieve high availability in your services by automatically recovering AMQP resources when unexpected failures occur. Lyra also supports automatic invocation retries and recovery-related eventing, and exposes a simple, lightweight API built around the Java AMQP client library.
Lyra was created with the simple goal of recovering client-created AMQP resources from any RabbitMQ failure that could reasonably occur. While the Java AMQP client provides some automatic recovery, Lyra can recover from any type of failure, including Channel and Consumer closures, and provides flexible recovery policies, configuration, and eventing to help get your services back online.
Automatic Resource Recovery
The key feature of Lyra is its ability to automatically recover resources such as connections, channels, consumers, exchanges, queues and bindings when unexpected failures occur. Lyra provides a flexible policy to define how recovery should be performed.
To start, create a Config object, specifying a recovery policy:
// Back off from 1 second up to 30 seconds between attempts, for at most 20 attempts
Config config = new Config()
    .withRecoveryPolicy(new RecoveryPolicy()
        .withBackoff(Duration.seconds(1), Duration.seconds(30))
        .withMaxAttempts(20));
With our config, let's create some recoverable resources:
ConnectionOptions options = new ConnectionOptions().withHost("localhost");
Connection connection = Connections.create(options, config);

// Two channels on one connection, each with two consumers
Channel channel1 = connection.createChannel(1);
Channel channel2 = connection.createChannel(2);
channel1.basicConsume("foo-queue", consumer1);
channel1.basicConsume("foo-queue", consumer2);
channel2.basicConsume("bar-queue", consumer3);
channel2.basicConsume("bar-queue", consumer4);
This results in a resource topology in which one connection holds two channels, and each channel has two consumers.
If a connection or channel is unexpectedly closed, Lyra will attempt to recover it along with its dependents according to the recovery policy. In addition, any non-durable or auto-deleting exchanges and queues, along with their bindings, will be recovered unless they are explicitly deleted.
Automatic Invocation Retries
Lyra also supports invocation retries when a retryable failure occurs while creating a Connection or invoking a method against a Connection or Channel. Similar to recovery, retries are also performed according to a policy:
// Always attempt recovery; retry failed invocations up to 20 times,
// 1 second apart, for at most 5 minutes
Config config = new Config()
    .withRecoveryPolicy(RecoveryPolicies.recoverAlways())
    .withRetryPolicy(new RetryPolicy()
        .withMaxAttempts(20)
        .withInterval(Duration.seconds(1))
        .withMaxDuration(Duration.minutes(5)));

ConnectionOptions options = new ConnectionOptions().withHost("localhost");
Connection connection = Connections.create(options, config);
Channel channel = connection.createChannel();
channel.basicConsume("foo-queue", myConsumer);
Here we've created a new Channel, specifying a recovery policy to use in case any of our resources are unexpectedly closed as a result of an invocation failure, and a retry policy that dictates how and when the failed method invocation should be retried. If any method invocation such as channel.basicConsume() fails as the result of a retryable error, Lyra will attempt to recover any resources that were closed according to the recovery policy and retry the invocation according to the retry policy.
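The retry behaviour described above can be pictured with a small, library-independent sketch. The class `RetrySketch`, its `call` method, and the use of `IOException` as the stand-in for a retryable failure are all hypothetical; Lyra's own retry handling also coordinates with resource recovery, which this sketch leaves out.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical illustration of a retry loop; not Lyra's implementation. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Invokes the action, retrying on IOException (standing in for a "retryable" failure)
     * until it succeeds, maxAttempts is reached, or maxDurationMillis has elapsed.
     * Any other exception is treated as non-retryable and propagates immediately.
     */
    static <T> T call(Callable<T> action, int maxAttempts, long intervalMillis,
                      long maxDurationMillis) throws Exception {
        long deadline = System.nanoTime() + maxDurationMillis * 1_000_000L;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                boolean attemptsExhausted = attempt >= maxAttempts;
                boolean timeExhausted = System.nanoTime() - deadline >= 0;
                if (attemptsExhausted || timeExhausted) {
                    throw e; // Give up and surface the last failure to the caller.
                }
                Thread.sleep(intervalMillis); // Blocks the calling thread between attempts.
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice with a retryable error, then succeeds on the third attempt.
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transient failure");
            }
            return "consumer-tag";
        }, 20, 10L, 5_000L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The three limits in the signature correspond to the maximum attempts, interval, and maximum duration of a retry policy. Whichever limit is hit first ends the retries.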
Lyra allows for resource configuration to be applied at different levels. For example, global recovery and global retry policies can be configured for all resources. These policies can be overridden with specific policies for connection attempts, connections and channels. Lyra also allows for individual connections and channels to be re-configured after creation:
ConfigurableConnection configurableConnection = Config.of(connection);
ConfigurableChannel configurableChannel = Config.of(channel);
Lyra offers listeners for creation and recovery events:
Config config = new Config()
    .withConnectionListeners(myConnectionListener)
    .withChannelListeners(myChannelListener)
    .withConsumerListeners(myConsumerListener);
Event listeners can be useful for setting up additional resources during recovery, such as auto-deleted exchanges and queues.
On Recovery and Retry Policies
Recovery and retry policies allow you to specify:
- The maximum number of attempts to perform
- The maximum duration that attempts should be performed for
- The interval between attempts
- The maximum interval to exponentially back off to between attempts
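To make these parameters concrete, here is a minimal, library-independent sketch of how a backoff schedule could be derived from an initial interval, a maximum interval, and a maximum number of attempts. The class `BackoffSketch`, its method names, and the growth factor of 2 are assumptions made for illustration. They are not taken from Lyra's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lyra: computes the wait before each attempt. */
final class BackoffSketch {
    private BackoffSketch() {}

    /**
     * Returns the wait in milliseconds before the given 1-based attempt:
     * the initial interval multiplied by factor^(attempt - 1), capped at maxMillis.
     */
    static long intervalMillis(int attempt, long initialMillis, long maxMillis, double factor) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        double raw = initialMillis * Math.pow(factor, attempt - 1);
        // Cap at the maximum interval; this also covers very large or infinite values.
        return raw >= maxMillis ? maxMillis : (long) raw;
    }

    /** Builds the full schedule for a policy with a fixed maximum number of attempts. */
    static List<Long> schedule(int maxAttempts, long initialMillis, long maxMillis, double factor) {
        List<Long> waits = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            waits.add(intervalMillis(attempt, initialMillis, maxMillis, factor));
        }
        return waits;
    }

    public static void main(String[] args) {
        // Mirrors the earlier policy values: 1 second initial, 30 seconds maximum.
        System.out.println(schedule(8, 1_000L, 30_000L, 2.0));
    }
}
```

With a factor of 2, the wait doubles on each attempt until it reaches the cap, after which every further attempt waits the maximum interval.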
Lyra allows for recovery and retry policies to be set globally, for individual resource types, and for initial connection attempts.
On Recoverable / Retryable Failures
Lyra will only recover or retry on certain failures. By default these include connection errors that are not related to failed authentication, and channel or connection errors that might be the result of temporary network failures. It is possible to see different exceptions for the same failure on different platforms. Because of this, you can freely modify the sets of recoverable and retryable exceptions as needed to handle any type of failure.
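The idea of a modifiable set of retryable exceptions can be sketched in plain Java as follows. The class `RetryableFailures` and its default entries are hypothetical and chosen only for illustration. They do not reflect the exception types Lyra registers by default or the API it exposes for changing them.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical classifier; the class and its default set are illustrative, not Lyra's API. */
final class RetryableFailures {
    private final Set<Class<? extends Throwable>> retryable = new HashSet<>();

    RetryableFailures() {
        // Illustrative defaults: failures plausibly caused by a transient network problem.
        retryable.add(ConnectException.class);
        retryable.add(SocketTimeoutException.class);
    }

    /** Registers an additional exception type, e.g. one only seen on a particular platform. */
    RetryableFailures add(Class<? extends Throwable> type) {
        retryable.add(type);
        return this;
    }

    /** True if the failure, or any exception in its cause chain, is an instance of a registered type. */
    boolean isRetryable(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            for (Class<? extends Throwable> type : retryable) {
                if (type.isInstance(t)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Walking the cause chain is a design choice in this sketch: it lets a wrapped network error still count as retryable, while an unrelated failure such as an authentication error falls through as non-retryable.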
When a channel is closed and is in the process of being recovered, attempts to publish to that channel will result in AlreadyClosedException being thrown. Publishers should either wait and listen for recovery by way of a ChannelListener, or use a RetryPolicy to retry publish attempts once the channel is recovered.
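One way a publisher could "wait and listen" is with a small gate that is closed when the channel fails and reopened when recovery is signalled. The sketch below is plain Java and assumes nothing about Lyra's listener interfaces. `RecoveryGate` and its methods are hypothetical, and wiring `markRecovered()` into a ChannelListener callback is left to the application.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper, not part of Lyra: lets publishers wait while a channel is being recovered. */
final class RecoveryGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition recovered = lock.newCondition();
    private boolean usable = true;

    /** Call when the channel is known to be closed, e.g. after catching AlreadyClosedException. */
    void markClosed() {
        lock.lock();
        try {
            usable = false;
        } finally {
            lock.unlock();
        }
    }

    /** Call from whatever callback signals that the channel has been recovered. */
    void markRecovered() {
        lock.lock();
        try {
            usable = true;
            recovered.signalAll(); // Wake every publisher waiting on the gate.
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until the gate is usable or the timeout elapses; returns whether it is usable. */
    boolean awaitUsable(long timeout, TimeUnit unit) throws InterruptedException {
        long remainingNanos = unit.toNanos(timeout);
        lock.lock();
        try {
            while (!usable) {
                if (remainingNanos <= 0L) {
                    return false;
                }
                remainingNanos = recovered.awaitNanos(remainingNanos);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

A publisher would call `awaitUsable` before each publish and treat a `false` result as a signal to fail or back off rather than publish to a closed channel.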
On Message Delivery
When a channel is closed and recovered, any messages that were delivered but not acknowledged will be redelivered on the newly recovered channel. Attempts to ack/nack/reject messages that were delivered before the channel was recovered are simply ignored since their delivery tags will be invalid for the newly recovered channel.
Note that because channel recovery happens transparently, messages may be seen more than once on a recovered channel when redelivery occurs.
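Because redelivered messages arrive with new delivery tags, a consumer that needs to detect duplicates has to key on something that survives recovery. The following sketch assumes that publishers attach a unique, application-level message ID to each message. The class `RedeliveryFilter` is hypothetical and is not part of Lyra or the RabbitMQ client.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: remembers recently processed message IDs so redeliveries can be skipped. */
final class RedeliveryFilter {
    private final Set<String> seen;

    RedeliveryFilter(final int capacity) {
        // Insertion-ordered map that evicts its eldest entry once capacity is exceeded,
        // so memory use stays bounded.
        this.seen = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        });
    }

    /** Returns true the first time an ID is offered and false for a duplicate still in memory. */
    synchronized boolean firstDelivery(String messageId) {
        return seen.add(messageId);
    }
}
```

A consumer would call `firstDelivery` with the message ID before processing and skip the work when it returns `false`. The bounded capacity means very old duplicates can slip through, so handlers that must never repeat work should still be idempotent.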
When a Connection or Channel is closed unexpectedly, recovery occurs in a background thread. If a retry policy is configured, any invocation attempts will block the calling thread until the Connection or Channel is recovered and the invocation can be retried.
QueueingConsumer is deprecated. Since the current implementation is not recoverable it should not be used with Lyra. Instead it's recommended to extend DefaultConsumer or implement Consumer directly. See the JavaDoc for more details.
- JavaDocs are available here.
- The various failure scenarios handled by Lyra are described here.
- See the Lyra cookbook for handling specific RabbitMQ use cases.
- A Clojure wrapper for Lyra is available.
Thanks to Brett Cameron, Michael Klishin and Matthias Radestock for their valuable ideas and feedback.
Copyright 2013-2014 Jonathan Halterman - Released under the Apache 2.0 license.