
Producer OOM precaution #10

Closed
wants to merge 3 commits

Conversation

ZhuYouzhi
Contributor

Hi, Edenhill.

In current librdkafka, if the producer produces data faster than the broker can handle, data accumulates in rk->rk_op, which may cause an OOM (this did happen in my case).

To prevent this, this patch adds a new parameter, max_payload_size: if the sum of queued payload sizes exceeds max_payload_size, incoming messages will be dropped.
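Roughly, the idea is the following (a hypothetical sketch with illustrative names, not the actual diff; the real patch works on librdkafka's internal queue):

```c
/* Hypothetical sketch of the byte-based cap described above.
 * Names and structure are illustrative, not the actual patch. */
static size_t queued_bytes = 0;            /* total payload bytes queued     */
static size_t max_payload_size = 1 << 30;  /* configurable cap, e.g. 1 GB    */

/* Called before enqueuing a message on the producer queue. */
static int producer_enqueue_check(size_t payload_len) {
        if (queued_bytes + payload_len > max_payload_size)
                return -1;                 /* over the cap: drop the message  */
        queued_bytes += payload_len;       /* real code would need a lock or
                                            * an atomic add here              */
        return 0;
}
```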

@edenhill
Contributor

edenhill commented Jun 3, 2013

Hi Zhu!

Thanks for your contribution.

I added a similar setting to the 0.7 branch a couple of weeks ago.
There are two differences:

  • instead of limiting on total payload size, it limits the total number of messages.
  • it lets the application know (via a -1 return from rd_kafka_produce() and errno set to ENOBUFS) that the message was dropped, allowing the application to take appropriate action, whatever that may be (spool, backpressure, log, ...).

See this commit:
06fd28e

Configuration: producer.max_outq_msg_cnt

I've merged the 0.7 branch down to master, including this commit.
Let me know if the existing solution (limit on messages, not size) is good enough for you.
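For reference, a minimal sketch of the pattern this enables on the application side (written against the present-day produce API; the 0.7-era call signature and the producer.max_outq_msg_cnt setting differ in detail, so treat the exact calls as an assumption):

```c
#include <errno.h>
#include <stdio.h>
#include <librdkafka/rdkafka.h>

/* Produce one message, applying backpressure when the producer queue is full.
 * rk and rkt are assumed to be an already-created handle and topic. */
static void produce_with_backpressure(rd_kafka_t *rk, rd_kafka_topic_t *rkt,
                                      void *payload, size_t len) {
        while (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA, RD_KAFKA_MSG_F_COPY,
                                payload, len, NULL, 0, NULL) == -1) {
                if (errno != ENOBUFS) {
                        /* Some other error: log and give up on this message. */
                        fprintf(stderr, "produce failed: %s\n",
                                rd_kafka_err2str(rd_kafka_errno2err(errno)));
                        return;
                }
                /* Queue is full: serve delivery reports until space frees up,
                 * then retry. */
                rd_kafka_poll(rk, 100);
        }
}
```

Because the drop is signalled synchronously, the application can equally well choose to spool to disk or just log and move on instead of blocking as above.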

@ZhuYouzhi
Contributor Author

Hi, Edenhill.

"it lets the application know (by -1 return from rd_kafka_produce() and errno ENOBUFS) that it has dropped the message, allowing the application to take appropriate actions, " is cool. Thanks.

There are two small questions I'd like to discuss:
(1) Limiting only on queue length may not be enough, because message sizes vary. Limiting on both queue size (bytes) and queue length may be a good idea; see the sketch after point (2).
(2) As for dropping a message (due to exceeding max_outq_msg_cnt), the current patch can leak the memory allocated for the payload, if I read it right.
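For point (1), here is a hypothetical application-side version of the byte limit, kept outside librdkafka: track outstanding payload bytes around produce calls and release them in the delivery report callback. Names such as produce_capped and outstanding_bytes are illustrative; the sketch assumes a single producing thread and that dr_cb is registered via rd_kafka_conf_set_dr_cb() before rd_kafka_new().

```c
#include <librdkafka/rdkafka.h>

static size_t outstanding_bytes = 0;               /* bytes currently queued  */
static const size_t MAX_OUTSTANDING = 256 << 20;   /* e.g. cap at 256 MB      */

/* Delivery report callback: the payload has left the queue (delivered or
 * failed), so release its bytes either way. */
static void dr_cb(rd_kafka_t *rk, void *payload, size_t len,
                  rd_kafka_resp_err_t err, void *opaque, void *msg_opaque) {
        outstanding_bytes -= len;
}

static int produce_capped(rd_kafka_t *rk, rd_kafka_topic_t *rkt,
                          void *payload, size_t len) {
        if (outstanding_bytes + len > MAX_OUTSTANDING)
                return -1;        /* byte cap reached: drop (or spool/retry)   */
        if (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA, RD_KAFKA_MSG_F_COPY,
                             payload, len, NULL, 0, NULL) == -1)
                return -1;        /* count cap (ENOBUFS) or other error        */
        outstanding_bytes += len;
        rd_kafka_poll(rk, 0);     /* serve delivery reports (runs dr_cb)       */
        return 0;
}
```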

@edenhill
Contributor

edenhill commented Jun 3, 2013

  1. This is true, but it would add the cost of an additional atomic variable barrier/sync.
    Don't you think the application can assume an average, or safe, message size and use that to calculate max_outq_msg_cnt accordingly? (A worked sizing example follows after point 2.)

  2. I don't think your patch leaks memory, as op_destroy() will clean up allocated resources.
    But the max message size check is done too late; it should happen before any memory has been allocated, like the max_outq_msg_cnt check does.
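For point 1, sizing max_outq_msg_cnt from a memory budget is simple arithmetic (the budget and average size below are illustrative assumptions, not values from this thread):

```c
/* Illustrative sizing: budget and average size are assumptions. */
size_t mem_budget    = 512 * 1024 * 1024;          /* 512 MB queue budget      */
size_t avg_msg_size  = 4 * 1024;                   /* assumed 4 KB avg payload */
int max_outq_msg_cnt = mem_budget / avg_msg_size;  /* = 131072 messages        */
```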

@ZhuYouzhi
Contributor Author

Hi, Edenhill. Thanks for your reply.

@ZhuYouzhi ZhuYouzhi closed this Jun 4, 2013
@winbatch winbatch mentioned this pull request Feb 11, 2014
@vdeters vdeters mentioned this pull request Apr 12, 2016
@DavidLiuXh DavidLiuXh mentioned this pull request Apr 26, 2017
@vk-coder vk-coder mentioned this pull request May 16, 2018
azat referenced this pull request in azat-archive/librdkafka Jul 9, 2024
Fix get/init race for stat counters