Skip to content

Jetstream Clustering

rmuthupandian edited this page Feb 11, 2015 · 1 revision

Jetstream provides clustering through its [pub-sub messaging service] (../wiki/Transport-Serivce) or with Kafka queues. When using the pub-sub messaging service, applications can be clustered using the InboundMessagingChannel and OutboundMessagingChannel. This type of clustering is targeted for very low latency use cases and is considerably cheaper than kafka clustering. It uses a push messaging model with at most once delivery semantics. The OutboundMessagingChannel is hosted in the producer application and requires a corresponding InboundMessagingChannel in the consumer application. An application cluster can be deployed across multiple data centers in a cloud environment. Both consumer and producer application clusters must be provisioned to communicate over the same topic. Consumer application clusters are discovered by the producing application clusters through Zookeeper. For inter cluster communication Zookeeper has to be deployed in the system. Zookeeper is only needed for discovery. Once the producers discover the consumers they communicate directly with each other. However if a node fails and comes back up it requires zookeeper to advertise itself.

#Screenshot

Jetstream supports stream affinity through its messaging layer. The event is decorated with a affinity key and passed down to the transport layer. A consistent hashing algorithm is used to create a stream affinity. This guarantees that the stream with a specific affinity key always lands on the same node in the consuming cluster. Jetstream clusters support a self healing feature - if a node goes down traffic directed to consumer cluster is rebalanced by all producers. When a producer detects a slow consumer, it can post advisory messages upstream along with an event that could not be forwarded to the consumer. If a kafka AdviceListener is injected in to the OutboundMessagingChannel, the event can be queued in Kafka and replayed to the consumer cluster as shown below.

#Screenshot

This approach significantly improves the reliability of the transport.

Jetstream also supports clustering using kafka queues. In this approach the streams flow from producers to consumers through Kafka queues. It provides a pull messaging model with at least once delivery semantics.

#Screenshot

This type of clustering is targeted for use cases that require higher levels of delivery guarantees but don't have strict latency requirements. This approach also supports micro batching. Applications can be clustered in this approach by using InboundKafkaChannel and OutboundKafkaChannel. Jetstream's InboundKafkaChannel can handle rebalancing when one of the consuming node goes down. The traffic is now served by a different node.

Clone this wiki locally