
Add producing batched messages #112

Closed
Gvain opened this issue May 7, 2014 · 13 comments

@Gvain commented May 7, 2014

To improve throughput, I think we should add producing batched messages, like rd_kafka_consume_batch() does.
Producing batched messages is also supported by the Java API.

@edenhill (Contributor) commented May 7, 2014

Good idea!

It should probably take an array of rd_kafka_message_t, allowing a per-message key, opaque, etc. to be set. This would even allow per-message partitioning, if that is of interest, even though it would slow things down - so maybe that should be unsupported.

int rd_kafka_produce_batch (rd_kafka_topic_t *rkt, [int32_t partition,] rd_kafka_message_t *rkmessages, int message_cnt, int msgflags);
Returns the number of messages accepted; unaccepted messages will have their `rkmessages[N].err` value set accordingly.
It would be nice to have per-message msgflags (COPY, FREE), but that would require a whole new struct (the current one can't be changed due to ABI restrictions), so this could be left as a FIXME for the API v2 work.

dr_cb, if set, will be called for each message, not for the whole batch.

Sounds good?
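
For reference, the existing rd_kafka_message_t from rdkafka.h (abridged): a batch call along these lines would set payload/len, key/key_len and _private per message, and read err back after the call:

    typedef struct rd_kafka_message_s {
            rd_kafka_resp_err_t err;   /* set per message on failure */
            rd_kafka_topic_t *rkt;     /* topic */
            int32_t partition;         /* partition */
            void   *payload;           /* message payload */
            size_t  len;               /* payload length */
            void   *key;               /* optional per-message key */
            size_t  key_len;           /* key length */
            int64_t offset;            /* offset (used on the consumer side) */
            void   *_private;          /* per-message opaque */
    } rd_kafka_message_t;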

@Gvain (Author) commented May 7, 2014

Sounds good.
One more thing: if I set "batch.num.messages", would that have somewhat the same effect as producing batched messages?

@edenhill (Contributor) commented May 7, 2014

Internally, librdkafka already batches messages sent to the broker, and that is what batch.num.messages controls: how many messages to grab from the internal queue and send to the broker in one Produce request.

The new produce_batch() API is not related to batch.num.messages; it will simply allow adding multiple messages to the internal queues in one call (and one lock!), which will be a win performance-wise.
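
A minimal sketch of tuning batch.num.messages through the standard config API (the value here is illustrative, not a recommendation):

    char errstr[512];
    rd_kafka_conf_t *conf = rd_kafka_conf_new();

    /* Cap each internal Produce request at 1000 messages. */
    if (rd_kafka_conf_set(conf, "batch.num.messages", "1000",
                          errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
            fprintf(stderr, "%s\n", errstr);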

@Gvain (Author) commented May 7, 2014

Got it. Thanks a lot.

edenhill added a commit that referenced this issue May 9, 2014:
This is a high performance interface for enqueuing multiple messages in one go.
@edenhill (Contributor) commented May 9, 2014

There you go, rd_kafka_produce_batch().
See rdkafka.h for documentation and tests/0011-produce_batch.c for a usage example.

Please verify this on your end.
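
A minimal usage sketch (assuming rkt is a valid topic handle and lines is an array of cnt NUL-terminated strings; see tests/0011-produce_batch.c for the full example):

    /* Requires <stdio.h>, <stdlib.h>, <string.h> and librdkafka/rdkafka.h */
    rd_kafka_message_t *msgs = calloc(cnt, sizeof(*msgs));
    int i, accepted;

    for (i = 0; i < cnt; i++) {
            msgs[i].payload = lines[i];
            msgs[i].len     = strlen(lines[i]);
            /* key, key_len and _private (per-message opaque) may also be set */
    }

    /* RD_KAFKA_MSG_F_COPY: librdkafka copies the payloads, so the caller
     * keeps ownership of lines[]. Returns the number of messages enqueued. */
    accepted = rd_kafka_produce_batch(rkt, RD_KAFKA_PARTITION_UA,
                                      RD_KAFKA_MSG_F_COPY, msgs, cnt);

    /* Messages that were not accepted have their .err field set. */
    for (i = 0; i < cnt; i++)
            if (msgs[i].err)
                    fprintf(stderr, "message %d failed: %s\n", i,
                            rd_kafka_err2str(msgs[i].err));

    free(msgs);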

@edenhill (Contributor) commented
Hey @Gvain, have you had time to test the new batch call yet?

@Gvain (Author) commented Jul 2, 2014

@edenhill I am so sorry for the late reply! The notification email was mistakenly caught by a newly set-up mail filter. The new batch call seems good. Thank you.

@kparadkar commented
Do you have any performance numbers for rd_kafka_produce_batch vs rd_kafka_produce? Does it give a significant increase in throughput?

@edenhill (Contributor) commented
@kparadkar Experiments have shown that there typically isn't much to gain from using the produce_batch() API.
The improvement it provides over the standard produce() API comes mainly when you are producing a large number of messages to the same partition, since it cuts the number of lock acquisitions from one per message down to one per batch.

But, as is always the case with performance, you will need to do your own measurements in your own environment with your own usage pattern.

@archanapujar commented
How does rd_kafka_produce_batch() handle a NULL topic handle? Will it error out with something like EINVAL?

@edenhill (Contributor) commented
@archanapujar A NULL topic handle is not supported, and the produce_batch() call will crash.
It is up to the application to provide valid function arguments.
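
For example, a minimal caller-side guard (a sketch; the library itself performs no such check):

    /* rd_kafka_produce_batch() does not validate its topic handle and
     * would dereference a NULL pointer, so check before calling.
     * Requires <errno.h>. */
    if (rkt == NULL) {
            errno = EINVAL;
            return -1;
    }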

@archanapujar commented
@edenhill Thanks for the quick response! Does this behavior hold for all the supported APIs?

@edenhill (Contributor) commented
Generally, yes.
