Avoid producing synchronously if at all possible
You should be aware that producing synchronously will reduce maximum throughput dramatically. For example, you might achieve a rate of about 500,000 msg/s if you don't wait on the result of each produce call before sending the next. If you do, you'll get at best ~300 msg/s.
This means you should typically not await your ProduceAsync calls, the exception being if you are in a highly concurrent scenario (such as a web request handler). In this case, although an individual call may be slow, it will not impact other calls happening simultaneously.
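As a rough sketch of the non-blocking pattern, the snippet below starts a batch of ProduceAsync calls without awaiting each one, then waits for them together. The broker address localhost:9092, the topic name demo-topic, and the message count are placeholders chosen for illustration, not values from this page.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Confluent.Kafka;

class Program
{
    static async Task Main()
    {
        // Assumed broker address; replace with your own.
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

        using var producer = new ProducerBuilder<Null, string>(config).Build();

        // Start every produce call before waiting on any of them, so the
        // client is free to batch messages into fewer broker requests.
        var pending = new List<Task<DeliveryResult<Null, string>>>();
        for (int i = 0; i < 1000; i++)
        {
            pending.Add(producer.ProduceAsync(
                "demo-topic", // hypothetical topic name
                new Message<Null, string> { Value = $"message {i}" }));
        }

        // Wait for all deliveries at once instead of one at a time.
        // This throws if any of the deliveries failed.
        var results = await Task.WhenAll(pending);
        Console.WriteLine($"Delivered {results.Length} messages");
    }
}
```

Awaiting inside the loop instead would force each message to complete a full round trip before the next one is sent, which is the slow case described above.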
Produce and ProduceAsync are equivalent, but tailored to different usage patterns. The Produce method is more efficient, and you should care about that if your throughput is high (more than roughly 20k msgs/s); otherwise the difference will be negligible compared to whatever else your application is doing.
Another difference is that tasks returned by ProduceAsync complete on thread pool threads, so ordering is not guaranteed. By comparison, delivery report callbacks passed to Produce are all called on the same thread, so their order exactly matches the order in which the client learns that the deliveries have completed.
Most importantly, you should be checking the result of each produce call. In the case of ProduceAsync, any error will be exposed in the form of a KafkaException if you await the call; otherwise it will show up in the Exception property of the returned task (with the IsFaulted property set to true). In the case of Produce, errors are exposed via the Error property on the DeliveryReport instance passed to the callback.
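The two error-checking styles can be sketched side by side as follows. This is a minimal illustration, assuming a broker at localhost:9092 and a hypothetical topic named demo-topic. The catch clause uses ProduceException, which derives from KafkaException.

```csharp
using System;
using System.Threading.Tasks;
using Confluent.Kafka;

class Program
{
    static async Task Main()
    {
        // Assumed broker address; replace with your own.
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };
        using var producer = new ProducerBuilder<Null, string>(config).Build();

        // ProduceAsync: a failed delivery surfaces as an exception when awaited.
        try
        {
            var result = await producer.ProduceAsync(
                "demo-topic", new Message<Null, string> { Value = "via ProduceAsync" });
            Console.WriteLine($"Delivered to {result.TopicPartitionOffset}");
        }
        catch (ProduceException<Null, string> e) // derives from KafkaException
        {
            Console.WriteLine($"Delivery failed: {e.Error.Reason}");
        }

        // Produce: the outcome arrives in the delivery report callback.
        producer.Produce(
            "demo-topic",
            new Message<Null, string> { Value = "via Produce" },
            report =>
            {
                if (report.Error.IsError)
                    Console.WriteLine($"Delivery failed: {report.Error.Reason}");
                else
                    Console.WriteLine($"Delivered to {report.TopicPartitionOffset}");
            });

        // Wait for outstanding delivery reports (or until the timeout elapses).
        producer.Flush(TimeSpan.FromSeconds(10));
    }
}
```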
Note: when transactions become available in v1.4, you will typically be able to rely on the result of the transaction commit. You won't need to check the produce delivery reports for errors if you are using transactions.
Events sent to the error event handler and log event handler are typically informational in nature, and you can usually ignore them. In very rare cases, events on the error handler may be marked as fatal (IsFatal set to true). This can occur if idempotence or transaction semantics cannot be satisfied due to an unlikely scenario on the server (e.g. unexpected log truncation), or a bug in the system.
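A minimal sketch of wiring up both handlers is shown below. The broker address is a placeholder, and what to do on a fatal error (here, just writing to standard error) is a hypothetical policy for illustration, not a recommendation from this page.

```csharp
using System;
using Confluent.Kafka;

class Program
{
    static void Main()
    {
        // Assumed broker address; replace with your own.
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

        using var producer = new ProducerBuilder<Null, string>(config)
            .SetErrorHandler((_, error) =>
            {
                if (error.IsFatal)
                {
                    // Hypothetical policy: report the error so the application
                    // can decide to recreate the producer or shut down.
                    Console.Error.WriteLine($"Fatal producer error: {error.Reason}");
                }
                else
                {
                    // Usually informational only.
                    Console.WriteLine($"Producer error (informational): {error.Reason}");
                }
            })
            .SetLogHandler((_, log) =>
                Console.WriteLine($"[{log.Level}] {log.Facility}: {log.Message}"))
            .Build();

        // ... produce messages here ...
    }
}
```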
The words 'error' and even 'fatal' are used a fair bit in the logs in benign situations. Don't be easily alarmed by this: if the producer is still operating, these are likely nothing to worry about.
Optimal LingerMs Setting
A commonly adjusted configuration property on the producer is LingerMs, the minimum time between batches of messages being sent to the cluster. Larger values allow for more batching, which increases throughput. Smaller values may reduce latency. The tradeoff becomes somewhat more complicated as throughput increases, because a smaller LingerMs results in more broker requests, putting greater load on the server, which in turn may reduce throughput.
A good general-purpose setting is 5 ms (the default is 0.5 ms).
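For illustration, the setting can be applied through ProducerConfig roughly like this (the broker address is an assumed placeholder):

```csharp
using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092", // assumed broker address
    LingerMs = 5                         // value is in milliseconds
};
```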
If your throughput is very low, and you really care about latency, you should probably set
The default value for the Acks configuration property is All (prior to v1.0, the default was 1). This means that if a delivery report returns without error, the message has been replicated to all replicas in the in-sync replica set.
If you have EnableIdempotence set to true, Acks must be All.
You should generally prefer having Acks set to All. There's no real benefit to setting it lower. Note that a lower setting won't improve end-to-end latency, because messages must be replicated to all in-sync replicas before they are available for consumption. You will just get to know whether the message has been successfully written to the leader replica a little earlier.
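Setting the property explicitly, rather than relying on the default, might look like the sketch below (the broker address is an assumed placeholder):

```csharp
using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092", // assumed broker address
    Acks = Acks.All                      // wait for all in-sync replicas
};
```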
How can I ensure delivery of messages in the order specified?
Set EnableIdempotence to true. This has very little overhead, so there's not much downside to having it on by default.
Before the idempotent producer was available, you could achieve this by setting MaxInFlight to 1 (at the expense of reduced throughput). Messages are sent in the same order as produced. However, when idempotence is not enabled, in case of failure (a temporary network failure, for example), Confluent.Kafka will try to resend the data (if the Retries config property is set to > 0; the default is 2), which can cause reordering. Note that this will not happen frequently, only in case of failure.
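The two ordering approaches described above can be sketched as configuration objects. The variable names and broker address are illustrative assumptions; only one of the two configurations would be used in practice.

```csharp
using Confluent.Kafka;

// Preferred: the idempotent producer preserves ordering across retries.
var idempotentConfig = new ProducerConfig
{
    BootstrapServers = "localhost:9092", // assumed broker address
    EnableIdempotence = true
};

// Older alternative: allow only one request in flight at a time,
// trading throughput for ordering.
var singleInFlightConfig = new ProducerConfig
{
    BootstrapServers = "localhost:9092", // assumed broker address
    MaxInFlight = 1
};
```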
Custom partitioners are not available in the .NET Client yet. You can get around this simply by producing to specific partitions in your produce call (i.e. partition outside the framework of the client).
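One way to partition outside the client is sketched below: hash the key yourself and pass an explicit TopicPartition to Produce. The partition count, topic name, key, and the ChoosePartition helper are all hypothetical. The FNV-1a hash is just one stable choice and will not match the partitioning the client would otherwise apply, so use one approach consistently for a given topic.

```csharp
using System;
using System.Text;
using Confluent.Kafka;

class Program
{
    // Hypothetical: the number of partitions in the topic, assumed known up front.
    const int PartitionCount = 6;

    // FNV-1a hash over the UTF-8 bytes of the key. string.GetHashCode() is
    // avoided because it is not stable across processes on .NET Core.
    static int ChoosePartition(string key, int partitionCount)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619u;
            }
            return (int)(hash % (uint)partitionCount);
        }
    }

    static void Main()
    {
        // Assumed broker address; replace with your own.
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };
        using var producer = new ProducerBuilder<string, string>(config).Build();

        string key = "customer-42"; // hypothetical key
        var target = new TopicPartition(
            "demo-topic", new Partition(ChoosePartition(key, PartitionCount)));

        // Produce to the explicitly chosen partition.
        producer.Produce(
            target,
            new Message<string, string> { Key = key, Value = "hello" },
            report =>
            {
                if (report.Error.IsError)
                    Console.WriteLine($"Delivery failed: {report.Error.Reason}");
            });

        producer.Flush(TimeSpan.FromSeconds(10));
    }
}
```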