
WaterDrop Variants

Maciej Mensfeld edited this page May 10, 2024 · 2 revisions

Important

GitHub Wiki is just a mirror of our online documentation.

We highly recommend using our website docs due to GitHub Wiki limitations. Only some illustrations, links, screencasts, and code examples will work here, and the formatting may be broken.

Please use https://karafka.io/docs.



WaterDrop variants can manage different configuration settings per topic using the same Kafka producer with shared TCP connections. This feature allows optimal utilization of a producer's TCP connections while enabling tailored dispatch requirements for individual topics. Variants are beneficial when working with topics with varying levels of importance or different throughput and latency requirements.

Creating and Using Variants

To leverage variants in WaterDrop, you initialize a standard producer with default settings that apply broadly to all topics for which you intend to produce messages. Then, you can create variants of this producer with configurations specific to particular topics. These variants allow for topic-specific adjustments without needing multiple producer instances, thus conserving system resources and maintaining high performance.

Variants are created using the #with method, which is what enables topic-specific configuration on a single producer instance. The #with method accepts two arguments:

  • max_wait_timeout: This is a root-scoped setting.
  • topic_config hash: This is where all topic-specific configurations are defined.

Attributes placed inside the topic_config hash during variant creation are referred to as topic_config scoped. Conversely, settings like max_wait_timeout, which reside outside the topic_config hash, are considered root-scoped.

Here's a simple example to demonstrate how to define and use variants with WaterDrop:

# Initialize the main producer with common settings
producer = WaterDrop::Producer.new do |config|
  config.kafka = {
    'bootstrap.servers': 'localhost:9092',
    'acks': '2'  # Default acknowledgment setting for medium-importance topics
  }
end

# Create variants with specific settings
low_importance = producer.with(topic_config: { acks: 1 })
high_importance = producer.with(topic_config: { acks: 'all' })

# Use variants like regular producers
low_importance.produce_async(topic: 'low_priority_events', payload: event.to_json)
high_importance.produce_async(topic: 'critical_events', payload: event.to_json)
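
The example above sets only topic_config scoped attributes. As a rough sketch of how the two scopes fit together, the snippet below splits a flat settings hash into the keyword arguments that #with expects. The `variant_args` helper and the `ROOT_SCOPED` constant are hypothetical names introduced for illustration and are not part of WaterDrop; the `10_000` value is arbitrary.

```ruby
# Hypothetical helper (not part of WaterDrop): separates root-scoped settings
# from topic_config scoped ones, following the scoping rules described above.
ROOT_SCOPED = %i[max_wait_timeout].freeze

def variant_args(settings)
  root, topic = settings.partition { |key, _| ROOT_SCOPED.include?(key.to_sym) }
  args = root.to_h.transform_keys(&:to_sym)
  # Everything that is not root-scoped goes under the topic_config key
  args[:topic_config] = topic.to_h unless topic.empty?
  args
end

args = variant_args(max_wait_timeout: 10_000, acks: 'all', 'compression.type': 'zstd')

# With a configured producer, the result could then be passed on:
#   critical = producer.with(**args)
```

Keeping the split in one place makes it harder to accidentally put max_wait_timeout inside topic_config, where it does not belong.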

Configurable Settings

Variants allow you to modify several Kafka and producer-specific settings to better suit the characteristics of different topics:

| Scope | Attribute | Description |
|-------|-----------|-------------|
| root | max_wait_timeout | Controls how long the producer waits for the dispatch result before raising an error. |
| topic_config | acks, request.required.acks | Determines the number of broker acknowledgments required before considering a message delivery successful. |
| topic_config | compression.codec, compression.type | Specifies the type of codec used for compression (e.g., none, gzip, snappy, lz4, zstd). |
| topic_config | compression.level | Determines the compression level for the selected codec, affecting both the compression ratio and performance. |
| topic_config | delivery.timeout.ms, message.timeout.ms | Limits the time a produced message waits for successful delivery. A time of 0 is infinite. |
| topic_config | partitioner | Defines the partitioner to use for distribution across partitions within a topic. |
| topic_config | request.timeout.ms | The ack timeout of the producer request in milliseconds. |

Info: Additional Configuration Attributes Details

For a more comprehensive list of configuration settings supported by librdkafka, please visit the Librdkafka Configuration page: https://karafka.io/docs/Librdkafka-Configuration/
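
As an illustration of the attribute list above, the following sketch checks a topic_config hash before it is handed to #with. The `topic_config_problems` helper and the `TOPIC_CONFIG_ATTRIBUTES` constant are hypothetical and not a WaterDrop API. Attributes that appear together in the list are treated here as two names for the same option, so setting both is flagged.

```ruby
# Attribute names copied from the list above. Names in the same inner array
# appear together in that list and are treated here as aliases.
TOPIC_CONFIG_ATTRIBUTES = [
  %w[acks request.required.acks],
  %w[compression.codec compression.type],
  %w[compression.level],
  %w[delivery.timeout.ms message.timeout.ms],
  %w[partitioner],
  %w[request.timeout.ms]
].freeze

# Hypothetical helper: returns human-readable problems found in a topic_config hash.
def topic_config_problems(topic_config)
  keys = topic_config.keys.map(&:to_s)
  unknown = keys - TOPIC_CONFIG_ATTRIBUTES.flatten
  problems = unknown.map { |key| "#{key} is not listed as topic_config scoped" }

  TOPIC_CONFIG_ATTRIBUTES.each do |aliases|
    used = aliases & keys
    problems << "#{used.join(' and ')} set the same option" if used.size > 1
  end

  problems
end

ok = topic_config_problems({ 'acks': 'all', 'compression.type': 'zstd' })
bad = topic_config_problems({ 'acks': 1, 'request.required.acks': 'all', 'linger.ms': 5 })
```

Here `ok` is empty, while `bad` reports both the unlisted `linger.ms` key and the duplicated acknowledgment setting.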

Edge-Cases and Details

When using variants in WaterDrop, there are specific edge cases and operational nuances that you should be aware of to ensure optimal performance and behavior:

  • Buffering Behavior Across Variants: Variants share a single producer buffer. While topic_config specific settings are preserved per message, the max_wait_timeout applied during a flush is the one belonging to the variant that initiates the flush. As a result, messages buffered through other variants may be dispatched using the max_wait_timeout of the variant that is currently flushing.

  • Inconclusive Error Messages: Redefining max_wait_timeout without aligning it with other librdkafka settings can lead to inconclusive errors. When the timeout is out of sync with other operational parameters, the resulting errors can be difficult to diagnose. For a deeper understanding of this issue and how it might affect your Kafka operations, refer to the Error Handling documentation.
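
The buffering behavior described in the first point can be pictured with a toy model. The `ToyProducer` and `ToyVariant` classes below are hypothetical and do not mirror WaterDrop internals; they only show a shared buffer in which topic_config travels with each message while the flush uses the timeout of whichever variant triggers it.

```ruby
# Toy model only: hypothetical classes, not WaterDrop's implementation.
class ToyProducer
  attr_reader :buffer, :flushes

  def initialize
    @buffer = []   # one buffer shared by all variants
    @flushes = []  # records how each flush was performed
  end

  def with(max_wait_timeout:, topic_config: {})
    ToyVariant.new(self, max_wait_timeout, topic_config)
  end
end

class ToyVariant
  def initialize(producer, max_wait_timeout, topic_config)
    @producer = producer
    @max_wait_timeout = max_wait_timeout
    @topic_config = topic_config
  end

  # topic_config is stored with each message, so it is preserved per message
  def buffer(message)
    @producer.buffer << message.merge(topic_config: @topic_config)
  end

  # The whole shared buffer is flushed using this variant's timeout
  def flush
    messages = @producer.buffer.shift(@producer.buffer.size)
    @producer.flushes << { max_wait_timeout: @max_wait_timeout, messages: messages }
  end
end

producer = ToyProducer.new
fast = producer.with(max_wait_timeout: 1_000, topic_config: { acks: 1 })
safe = producer.with(max_wait_timeout: 30_000, topic_config: { acks: 'all' })

safe.buffer(topic: 'critical_events', payload: 'a')
fast.buffer(topic: 'low_priority_events', payload: 'b')
fast.flush # both buffered messages are flushed with fast's timeout

flush = producer.flushes.first
```

In this model the message buffered through `safe` keeps its `acks: 'all'` setting but is flushed under the shorter timeout of `fast`, which is the situation the first point warns about.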

These details are critical in effectively managing and troubleshooting your Kafka message production environment, especially when utilizing the flexibility of variants for different topic configurations.
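
The text above does not spell out which librdkafka settings max_wait_timeout needs to be aligned with. One plausible reading, stated here purely as an assumption, is that the producer should wait at least as long as the message delivery timeout, so that a final delivery result is available before the wait ends. The `wait_timeout_aligned?` helper below is hypothetical, not a WaterDrop API, and assumes both values use the same unit (milliseconds).

```ruby
# Hypothetical check, not a WaterDrop API. Assumption: both values are in
# milliseconds and "aligned" means max_wait_timeout >= message.timeout.ms.
def wait_timeout_aligned?(max_wait_timeout:, topic_config:)
  config = topic_config.transform_keys(&:to_s)
  delivery_timeout = config['message.timeout.ms'] || config['delivery.timeout.ms']

  # Nothing to compare against in this sketch when the variant does not override it
  return true if delivery_timeout.nil?

  # A delivery timeout of 0 means infinite, which a finite wait cannot cover
  return false if delivery_timeout.to_i.zero?

  max_wait_timeout >= delivery_timeout.to_i
end

aligned = wait_timeout_aligned?(
  max_wait_timeout: 60_000,
  topic_config: { 'message.timeout.ms': 50_000 }
)
misaligned = wait_timeout_aligned?(
  max_wait_timeout: 5_000,
  topic_config: { 'delivery.timeout.ms': 50_000 }
)
```

A check like this could run before creating a variant, though the Error Handling documentation remains the authoritative source for which settings interact.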

Conclusion

Variants address the need for dynamic, topic-specific configurations in applications interacting with Kafka. By enabling variations per topic within a single producer, WaterDrop helps streamline resource usage and enhance message dispatch efficiency, making it an essential tool for sophisticated Kafka-based messaging systems.
