High CPU Usage caused by librdkafka #1858

Closed · wuqingjun opened this issue Jun 22, 2018 · 19 comments

@wuqingjun commented Jun 22, 2018

Description - updated

The dump file shows 15 threads, four of which show very long CPU times. I am posting the call stacks of those four threads here; if you like, you can also open the attached file.

The symptom is similar to #1569, but that one should have been fixed.

And I did see the Kafka brokers go down and come back up at the time this issue happened, but I am unable to reproduce it.

Dump file info, obtained by running the ~*k command in WinDbg:
...

  10  Id: 2780.2cac Suspend: 0 Teb: 0000003d`ad520000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 (Inline Function) --------`-------- librdkafka!rd_clock+0x46 [e:\src\one\oss\librdkafka\src\rdtime.h @ 82] 
01 0000003d`ae7ffa80 00007ffe`ba56ba00 librdkafka!rd_kafka_broker_serve+0x13f [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2493] 
02 0000003d`ae7ffac0 00007ffe`ba56b021 librdkafka!rd_kafka_broker_ua_idle+0xe0 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2552] 
03 0000003d`ae7ffb10 00007ffe`ba5c3f5b librdkafka!rd_kafka_broker_thread_main+0x2e1 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 3459] 
04 0000003d`ae7ffd70 00007ffe`d3708364 librdkafka!_thrd_wrapper_function+0x1b [e:\src\one\oss\librdkafka\src\tinycthread.c @ 627] 
05 0000003d`ae7ffda0 00007ffe`d40d70b1 kernel32!BaseThreadInitThunk+0x14
06 0000003d`ae7ffdd0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

  11  Id: 2780.3be4 Suspend: 0 Teb: 0000003d`ad522000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000003d`ae8ff178 00007ffe`d085795f ntdll!NtWaitForMultipleObjects+0x14
01 0000003d`ae8ff180 00007ffe`d085785e KERNELBASE!WaitForMultipleObjectsEx+0xef
02 0000003d`ae8ff480 00007ffe`ba5c3eab KERNELBASE!WaitForMultipleObjects+0xe
03 0000003d`ae8ff4c0 00007ffe`ba595973 librdkafka!_cnd_timedwait_win32+0x6b [e:\src\one\oss\librdkafka\src\tinycthread.c @ 405] 
04 0000003d`ae8ff4f0 00007ffe`ba595719 librdkafka!rd_kafka_q_pop_serve+0x253 [e:\src\one\oss\librdkafka\src\rdkafka_queue.c @ 380] 
05 0000003d`ae8ff560 00007ffe`ba56a824 librdkafka!rd_kafka_q_pop+0x19 [e:\src\one\oss\librdkafka\src\rdkafka_queue.c @ 412] 
06 (Inline Function) --------`-------- librdkafka!rd_kafka_broker_ops_serve+0x17 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2443] 
07 0000003d`ae8ff5a0 00007ffe`ba56ba00 librdkafka!rd_kafka_broker_serve+0x54 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2465] 
08 0000003d`ae8ff5e0 00007ffe`ba56b021 librdkafka!rd_kafka_broker_ua_idle+0xe0 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2552] 
09 0000003d`ae8ff630 00007ffe`ba5c3f5b librdkafka!rd_kafka_broker_thread_main+0x2e1 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 3459] 
0a 0000003d`ae8ff890 00007ffe`d3708364 librdkafka!_thrd_wrapper_function+0x1b [e:\src\one\oss\librdkafka\src\tinycthread.c @ 627] 
0b 0000003d`ae8ff8c0 00007ffe`d40d70b1 kernel32!BaseThreadInitThunk+0x14
0c 0000003d`ae8ff8f0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

  12  Id: 2780.14e8 Suspend: 0 Teb: 0000003d`ad524000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000003d`ae9ff330 00007ffe`ba5c43a7 ntdll!RtlLeaveCriticalSection+0x29
01 0000003d`ae9ff360 00007ffe`ba591710 librdkafka!mtx_unlock+0x17 [e:\src\one\oss\librdkafka\src\tinycthread.c @ 282] 
02 0000003d`ae9ff390 00007ffe`ba56b9dc librdkafka!rd_kafka_toppar_fetch_decide+0x400 [e:\src\one\oss\librdkafka\src\rdkafka_partition.c @ 1891] 
03 (Inline Function) --------`-------- librdkafka!rd_kafka_broker_toppars_serve+0x1c [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2515] 
04 0000003d`ae9ff510 00007ffe`ba56b021 librdkafka!rd_kafka_broker_ua_idle+0xbc [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 2549] 
05 0000003d`ae9ff560 00007ffe`ba5c3f5b librdkafka!rd_kafka_broker_thread_main+0x2e1 [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 3459] 
06 0000003d`ae9ff7c0 00007ffe`d3708364 librdkafka!_thrd_wrapper_function+0x1b [e:\src\one\oss\librdkafka\src\tinycthread.c @ 627] 
07 0000003d`ae9ff7f0 00007ffe`d40d70b1 kernel32!BaseThreadInitThunk+0x14
08 0000003d`ae9ff820 00000000`00000000 ntdll!RtlUserThreadStart+0x21

  13  Id: 2780.1e1c Suspend: 0 Teb: 0000003d`ad526000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000003d`aeaff568 00007ffe`d40d2263 ntdll!NtQueryPerformanceCounter+0x14
01 0000003d`aeaff570 00007ffe`ba56af1d ntdll!RtlQueryPerformanceCounter+0x53
02 (Inline Function) --------`-------- librdkafka!rd_clock+0xd6 [e:\src\one\oss\librdkafka\src\rdtime.h @ 81] 
03 (Inline Function) --------`-------- librdkafka!rd_interval0+0xd6 [e:\src\one\oss\librdkafka\src\rdinterval.h @ 44] 
04 0000003d`aeaff5a0 00007ffe`ba5c3f5b librdkafka!rd_kafka_broker_thread_main+0x1dd [e:\src\one\oss\librdkafka\src\rdkafka_broker.c @ 3445] 
05 0000003d`aeaff800 00007ffe`d3708364 librdkafka!_thrd_wrapper_function+0x1b [e:\src\one\oss\librdkafka\src\tinycthread.c @ 627] 
06 0000003d`aeaff830 00007ffe`d40d70b1 kernel32!BaseThreadInitThunk+0x14
07 0000003d`aeaff860 00000000`00000000 ntdll!RtlUserThreadStart+0x21
[dumpfileinfo.txt](https://github.com/edenhill/librdkafka/files/2134019/dumpfileinfo.txt)

How to reproduce

Unfortunately, I cannot reproduce this issue.

When this issue happened, we had about 40K consumers and 10K producers connected to 10 Kafka brokers.

IMPORTANT: Always try to reproduce the issue on the latest released version (see https://github.com/edenhill/librdkafka/releases), if it can't be reproduced on the latest version the issue has been fixed.

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

  • librdkafka version (release number or git tag): 0.11.5
  • Apache Kafka version: 0.10.0
  • librdkafka client configuration: Default librdkafka settings.
  • Operating system: Windows 10 64bit
@edenhill (Contributor) commented:

Unfortunately, those stack traces don't make sense (in the librdkafka parts); it looks like the wrong symbol file may have been used.

What exact librdkafka version were you using? (v0.11.5 is not released yet, so that's not a thing.)
Did you use one of the librdkafka.redist packages?

@wuqingjun (Author) commented Jun 25, 2018

@edenhill, I corrected the description by updating the call stacks. Please take a look. Sorry for the confusion.

We built the library from the master branch; that is why the version number shows 0.11.5.

And I believe we're using librdkafka.redist: I am able to find files such as librdkafka.redist.nuspec, librdkafka.redist.props, and librdkafka.redist.targets in our project.

This seems to be an intermittent issue. I cannot reproduce it so far, but in our production environment we still see it happening, and the dump files still show almost the same thing.

What else can I check?

@wuqingjun (Author) commented:

@edenhill, I updated the call stacks. Please take a look. Thanks!

@edenhill (Contributor) commented:

Do you set socket.blocking.max.ms? If so, to what?

By the way, if you install the librdkafka.redist NuGet package you don't need to build librdkafka yourself; it comes prebuilt.

@edenhill (Contributor) commented:

Are the 4 tracebacks above from the 4 threads with a lot of CPU usage?
If so, it is kind of weird, because 3 of those threads show that they are in ua_idle, which means they don't really do much.

@wuqingjun (Author) commented Jun 28, 2018

@edenhill, socket.blocking.max.ms=1. As for the 4 thread call stacks, I don't know their CPU usage; the data I provided above is the time each thread has been running, which is more than 1 hour. The other threads have mostly been running for less than 1 second. Here is the data:
Thread Time
11:3be4 0 days 1:15:16.234
12:14e8 0 days 1:06:29.593
13:1e1c 0 days 1:04:30.609
10:2cac 0 days 1:02:32.171
0:40ac 0 days 0:00:13.906
3:2378 0 days 0:00:00.500
2:22a0 0 days 0:00:00.093
7:3cc0 0 days 0:00:00.078
8:40d0 0 days 0:00:00.015
4:4634 0 days 0:00:00.015
15:e4 0 days 0:00:00.000
14:4f10 0 days 0:00:00.000
9:3ea8 0 days 0:00:00.000
6:2af0 0 days 0:00:00.000
5:2940 0 days 0:00:00.000
1:32f8 0 days 0:00:00.000
alldump.txt

@edenhill (Contributor) commented:

Are you setting socket.blocking.max.ms=1?

@edenhill (Contributor) commented Jul 2, 2018

Are any of the connections to the brokers down?

@wuqingjun (Author) commented:

@edenhill, to your question "Are any of the connections to the brokers down?", the answer is yes and no. We encountered this issue multiple times:

  1. I used to see all 10 brokers down at the time this issue happened; in those cases, 30 librdkafka threads were created, and socket.blocking.max.ms=1.
  2. But when it happened in another test scenario, the brokers were not down, and socket.blocking.max.ms=1000.

The reference #1762 you gave above sounds reasonable to me, because socket.blocking.max.ms is set to a very low value, which may cause the 30 threads to keep spinning. But this happens intermittently: we have thousands of machines, and only around 10 of them run into this kind of problem.

But in scenario 2 above, socket.blocking.max.ms=1000 and there are only 4 threads, yet this also causes high CPU, and the issue still happens intermittently. It looks like there may be another root cause.
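For reference, here is a minimal sketch of how that property is set (illustrative only, not our exact production code):

#include <librdkafka/rdkafka.h>
#include <stdio.h>

int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();
        /* socket.blocking.max.ms caps how long a broker thread's IO poll
         * may block: with "1", an otherwise idle broker thread can wake
         * up on the order of 1000 times per second, and 30 such threads
         * add up to noticeable CPU. */
        if (rd_kafka_conf_set(conf, "socket.blocking.max.ms", "1000",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
                fprintf(stderr, "%s\n", errstr);
        rd_kafka_conf_destroy(conf);
        return 0;
}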

@edenhill (Contributor) commented Jul 5, 2018

This PR aims to improve on this scenario:
#1870

@edenhill edenhill added the bug label Aug 27, 2018
@edenhill edenhill reopened this Nov 20, 2018
@write2jaydeep commented Dec 17, 2018

Hi @edenhill, it seems you are working on the CPU usage problem. Let me add the problem on my end.

Checklist

Please provide the following information:

librdkafka version (release number or git tag): master (downloaded in Oct 2018)
Apache Kafka version: 1.1.1
librdkafka client configuration: queue.buffering.max.ms=1, socket.blocking.max.ms=100
Operating system: OpenVOS (POSIX-supported, Linux-like)

Note

As you mentioned in #2143, creating producer objects is costly. Still, I am using the same mechanism (create/destroy on each request). I am checking whether it causes the "increasing CPU usage" problem.

Issue:

On top of librdkafka, we have our own threads which pick up requests from a queue and produce messages by calling the librdkafka API.
CPU usage seems normal while there are no requests on the queue. When we put requests on the queue, CPU usage gradually increases with each request, and at some point reaches 100% even with a single thread.

How to generate the issue

I used rdkafka_simple_producer.c. You can put the whole code in a while(1) loop and produce a particular message to the Kafka server. Note that, for now, we recreate/destroy the objects on each request, as sketched below.
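For illustration, a hypothetical condensation of that loop (brokers, topic, and payload are placeholder parameters; error handling omitted):

#include <librdkafka/rdkafka.h>
#include <string.h>

/* The costly pattern: a full producer instance is created and torn
 * down again for every single message. */
static void produce_once(const char *brokers, const char *topic,
                         const char *payload) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();
        rd_kafka_conf_set(conf, "bootstrap.servers", brokers,
                          errstr, sizeof(errstr));
        rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf,
                                      errstr, sizeof(errstr));
        rd_kafka_producev(rk,
                          RD_KAFKA_V_TOPIC(topic),
                          RD_KAFKA_V_VALUE((void *)payload, strlen(payload)),
                          RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                          RD_KAFKA_V_END);
        rd_kafka_flush(rk, 1000);       /* wait for delivery */
        rd_kafka_destroy(rk);           /* full teardown, every request */
}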

Thanks
JD

@edenhill (Contributor) commented:

OpenVOS is not a supported platform; none of the librdkafka maintainers have ever tried running on OpenVOS, and it is very likely that subtle differences in threads, libc, POSIX, etc., may have CPU effects.

Secondly, it is strongly recommended to reuse the producer object: creating a new producer for each message is typically too costly to be practical, and it really does not serve much purpose unless the interval between messages is longer than a couple of minutes. See the sketch below.
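A minimal sketch of the recommended pattern, assuming placeholder broker and topic names (not a drop-in replacement for your code):

#include <librdkafka/rdkafka.h>
#include <stdio.h>
#include <string.h>

int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();
        if (rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK) {
                fprintf(stderr, "%s\n", errstr);
                return 1;
        }

        /* Create the producer ONCE; conf is now owned by rk. */
        rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf,
                                      errstr, sizeof(errstr));
        if (!rk) {
                fprintf(stderr, "%s\n", errstr);
                return 1;
        }

        /* Reuse the same handle for every request. */
        for (int i = 0; i < 1000; i++) {
                const char *payload = "hello";
                rd_kafka_resp_err_t err = rd_kafka_producev(
                        rk,
                        RD_KAFKA_V_TOPIC("test"),
                        RD_KAFKA_V_VALUE((void *)payload, strlen(payload)),
                        RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                        RD_KAFKA_V_END);
                if (err)
                        fprintf(stderr, "produce failed: %s\n",
                                rd_kafka_err2str(err));
                rd_kafka_poll(rk, 0);   /* serve delivery reports */
        }

        rd_kafka_flush(rk, 10 * 1000);  /* drain outstanding messages */
        rd_kafka_destroy(rk);           /* destroy ONCE, at shutdown */
        return 0;
}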

@write2jaydeep commented:

@edenhill, thank you!
I know OpenVOS isn't supported, but I am trying anyway; I will dig into the internal implementation if I can resolve the problems.
I am changing the code to reuse the objects. Thanks, man!

@GeorgeShagov commented Mar 1, 2019

Hello, I have a similar case. Here are the data:

Symptom: 100% CPU Utilization.

  • librdkafka version v0.11.6
  • Apache Kafka version: 2.0.0
  • librdkafka client configuration:
    Here is the producer's config:
    2019-03-01 05:37:31.655 INFO [9468] [main@192] Producer is configured: Topic: prd.cex.gz2.trd.ug; Config: api.version.fallback.ms: 1200000;api.version.request: true;api.version.request.timeout.ms: 10000;auto.commit.interval.ms: 5000;batch.num.messages: 1;broker.address.family: any;broker.address.ttl: 1000;broker.version.fallback: 0.9.0;builtin.features: gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins;check.crcs: false;client.id: rdkafka;compression.codec: none;coordinator.query.interval.ms: 600000;debug: broker,topic;default_topic_conf: 0x5565b968bfb0;delivery.report.only.error: false;enable.auto.commit: true;enable.auto.offset.store: true;enable.partition.eof: true;enabled_events: 0;fetch.error.backoff.ms: 500;fetch.max.bytes: 52428800;fetch.message.max.bytes: 1048576;fetch.min.bytes: 1;fetch.wait.max.ms: 100;group.protocol.type: consumer;heartbeat.interval.ms: 300;internal.termination.signal: 0;log.connection.close: false;log.queue: false;log.thread.name: true;log_cb: 0x7f1ac5ad0580;log_level: 6;max.in.flight.requests.per.connection: 1000000;message.copy.max.bytes: 65535;message.max.bytes: 1000000;message.send.max.retries: 2;metadata.broker.list: 192.168.10.24:9092,192.168.10.25:9092,192.168.10.26:9092;metadata.max.age.ms: -1;metadata.request.timeout.ms: 60000;offset.store.method: broker;open_cb: 0x7f1ac5af6f10;partition.assignment.strategy: range,roundrobin;queue.buffering.backpressure.threshold: 1;queue.buffering.max.kbytes: 1048576;queue.buffering.max.messages: 100000;queue.buffering.max.ms: 0;queued.max.messages.kbytes: 1048576;queued.min.messages: 100000;receive.message.max.bytes: 100000000;reconnect.backoff.jitter.ms: 500;retry.backoff.ms: 100;sasl.kerberos.kinit.cmd: kinit -S "%{sasl.kerberos.service.name}/%{broker.name}" -k -t "%{sasl.kerberos.keytab}" %{sasl.kerberos.principal};sasl.kerberos.min.time.before.relogin: 60000;sasl.kerberos.principal: kafkaclient;sasl.kerberos.service.name: kafka;sasl.mechanisms: GSSAPI;security.protocol: plaintext;session.timeout.ms: 30000;socket.blocking.max.ms: 1000;socket.keepalive.enable: true;socket.max.fails: 1;socket.nagle.disable: false;socket.receive.buffer.bytes: 0;socket.send.buffer.bytes: 0;socket.timeout.ms: 60000;socket_cb: 0x7f1ac5ae4890;statistics.interval.ms: 0;topic.metadata.refresh.fast.cnt: 10;topic.metadata.refresh.fast.interval.ms: 250;topic.metadata.refresh.interval.ms: 300000;topic.metadata.refresh.sparse: true;; Builder topic:

Producer and consumer config overrides:

[producer.common]
metadata.broker.list=192.168.10.24:9092,192.168.10.25:9092,192.168.10.26:9092
debug=broker,topic
acks=1
batch.size=1
batch.num.messages=1
linger.ms=0
#max.request.size=10
#enable.idempotance=1
#enable.gapless.guarantee=1
partition=-1

[consumer.common]
metadata.broker.list=192.168.10.24:9092,192.168.10.25:9092,192.168.10.26:9092
debug=consumer,topic
batch.size=1
batch.num.messages=10
partition=-1
group=me2
enable.auto.commit=0

  • Operating system: Ubuntu 18.04
  • Provide logs

%4|1551416433.504|REQTMOUT|rdkafka#producer-4| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Timed out 0 in-flight, 1 retry-queued, 0 out-queue, 0 partially-sent requests
%7|1551416433.505|REQERR|rdkafka#producer-4| [thrd:main]: 192.168.10.26:9092/bootstrap: MetadataRequest failed: Local: Timed out in queue: actions Retry
%7|1551416434.316|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.12 partition 0 Leader 3
%7|1551416434.316|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.11 partition 0 Leader 2
%7|1551416434.316|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.13 partition 0 Leader 2
%7|1551416434.316|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.10 partition 0 Leader 2
%7|1551416434.316|METADATA|rdkafka#producer-3| [thrd:main]: prod-kafka-instance-3:9092/3: 4/4 requested topic(s) seen in metadata
%7|1551416434.316|METADATA|rdkafka#producer-1| [thrd:main]: Topic prd.cex.bz2.trd.ug.12 partition 0 Leader 2
%7|1551416434.316|METADATA|rdkafka#producer-1| [thrd:main]: Topic prd.cex.bz2.trd.ug.11 partition 0 Leader 1
%7|1551416434.316|METADATA|rdkafka#producer-1| [thrd:main]: Topic prd.cex.bz2.trd.ug.13 partition 0 Leader 3
%7|1551416434.316|METADATA|rdkafka#producer-1| [thrd:main]: Topic prd.cex.bz2.trd.ug.10 partition 0 Leader 2
%7|1551416434.316|METADATA|rdkafka#producer-1| [thrd:main]: prod-kafka-instance-1:9092/1: 4/4 requested topic(s) seen in metadata
%7|1551416434.322|BROKERFAIL|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: failed: err: Local: Broker transport failure: (errno: Connection reset by peer)
%7|1551416434.322|FAIL|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Disconnected (after 1199973ms in state UP)
%7|1551416434.322|STATE|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Broker changed state UP -> DOWN
%7|1551416434.322|CONNECT|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: broker in state DOWN connecting
%7|1551416434.322|CONNECT|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Connecting to ipv4#192.168.10.25:9092 (plaintext) with socket 91
%7|1551416434.322|STATE|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Broker changed state DOWN -> CONNECT
%7|1551416434.322|CONNECT|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Connected to ipv4#192.168.10.25:9092
%7|1551416434.322|CONNECTED|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Connected (#616)
%7|1551416434.322|STATE|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Broker changed state CONNECT -> APIVERSION_QUERY
%7|1551416434.322|METADATA|rdkafka#producer-4| [thrd:main]: Topic prd.cex.gz2.trd.ug.12 partition 0 Leader 1
%7|1551416434.322|METADATA|rdkafka#producer-4| [thrd:main]: Topic prd.cex.gz2.trd.ug.11 partition 0 Leader 3
%7|1551416434.322|METADATA|rdkafka#producer-4| [thrd:main]: Topic prd.cex.gz2.trd.ug.13 partition 0 Leader 3
%7|1551416434.322|METADATA|rdkafka#producer-4| [thrd:main]: Topic prd.cex.gz2.trd.ug.10 partition 0 Leader 1
%7|1551416434.322|METADATA|rdkafka#producer-4| [thrd:main]: prod-kafka-instance-2:9092/2: 4/4 requested topic(s) seen in metadata
%7|1551416434.322|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.12 partition 0 Leader 3
%7|1551416434.322|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.11 partition 0 Leader 2
%7|1551416434.322|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.13 partition 0 Leader 2
%7|1551416434.322|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.10 partition 0 Leader 2
%7|1551416434.322|METADATA|rdkafka#producer-3| [thrd:main]: prod-kafka-instance-1:9092/1: 4/4 requested topic(s) seen in metadata
%7|1551416434.322|STATE|rdkafka#producer-3| [thrd:192.168.10.25:9092/bootstrap]: 192.168.10.25:9092/bootstrap: Broker changed state APIVERSION_QUERY ->
UP
%7|1551416434.323|METADATA|rdkafka#producer-2| [thrd:main]: Topic prd.cex.gz1.trd.ug.12 partition 0 Leader 2
%7|1551416434.323|METADATA|rdkafka#producer-2| [thrd:main]: Topic prd.cex.gz1.trd.ug.11 partition 0 Leader 3
%7|1551416434.323|METADATA|rdkafka#producer-2| [thrd:main]: Topic prd.cex.gz1.trd.ug.13 partition 0 Leader 2
%7|1551416434.323|METADATA|rdkafka#producer-2| [thrd:main]: Topic prd.cex.gz1.trd.ug.10 partition 0 Leader 3
%7|1551416434.323|METADATA|rdkafka#producer-2| [thrd:main]: prod-kafka-instance-1:9092/1: 4/4 requested topic(s) seen in metadata
%7|1551416434.392|BROKERFAIL|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: failed: err: Local: Broker transport failure: (errno: Connection reset by peer)
%7|1551416434.392|FAIL|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Disconnected (after 1200042ms in state UP)
%7|1551416434.392|STATE|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state UP -> DOWN
%7|1551416434.392|CONNECT|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: broker in state DOWN connecting
%7|1551416434.392|CONNECT|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connecting to ipv4#192.168.10.24:9092 (plaintext) with socket 101
%7|1551416434.392|STATE|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state DOWN -> CONNECT
%7|1551416434.393|BROKERFAIL|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: failed: err: Local: Broker transport failure: (errno: Connection reset by peer)
%7|1551416434.393|CONNECT|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connected to ipv4#192.168.10.24:9092
%7|1551416434.393|CONNECTED|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connected (#603)
%7|1551416434.393|FAIL|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Disconnected (after 2400065ms in state UP)
%7|1551416434.393|BROKERFAIL|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: failed: err: Local: Broker transport failure: (errno: Connection reset by peer)
%7|1551416434.393|STATE|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state UP -> DOWN
%7|1551416434.393|FAIL|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Disconnected (after 1200043ms in state UP)
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Broker changed state UP -> DOWN
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: broker in state DOWN connecting
%7|1551416434.393|BROKERFAIL|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: failed: err: Local: Broker transport failure: (errno: Connection reset by peer)
%7|1551416434.393|FAIL|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Disconnected (after 600019ms in state UP)
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state UP -> DOWN
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Connecting to ipv4#192.168.10.26:9092 (plaintext) with socket 37
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Broker changed state DOWN -> CONNECT
%7|1551416434.393|STATE|rdkafka#producer-3| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state CONNECT -> APIVERSION_QUERY
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: broker in state DOWN connecting
%7|1551416434.393|CONNECT|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: broker in state DOWN connecting
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Connected to ipv4#192.168.10.26:9092
%7|1551416434.393|CONNECTED|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Connected (#616)
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.26:9092/bootstrap]: 192.168.10.26:9092/bootstrap: Broker changed state CONNECT -> APIVERSION_QUERY
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connecting to ipv4#192.168.10.24:9092 (plaintext) with socket 49
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state DOWN -> CONNECT
%7|1551416434.393|CONNECT|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connecting to ipv4#192.168.10.24:9092 (plaintext) with socket 69
%7|1551416434.393|STATE|rdkafka#producer-1| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state DOWN -> CONNECT
%7|1551416434.393|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.12 partition 0 Leader 3
%7|1551416434.393|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.11 partition 0 Leader 2
%7|1551416434.393|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.13 partition 0 Leader 2
%7|1551416434.393|METADATA|rdkafka#producer-3| [thrd:main]: Topic prd.cex.dz1.trd.ug.10 partition 0 Leader 2
%7|1551416434.393|CONNECT|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connected to ipv4#192.168.10.24:9092
%7|1551416434.393|CONNECTED|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Connected (#605)
%7|1551416434.393|STATE|rdkafka#producer-2| [thrd:192.168.10.24:9092/bootstrap]: 192.168.10.24:9092/bootstrap: Broker changed state CONNECT -> APIVERSION_QUERY

Note: I can provide an archive on request.

  • Provide broker log excerpts

Working on this

  • Critical issue

gcore:

[Current thread is 1 (Thread 0x7f62fbfadf40 (LWP 23269))]
(gdb) info threads
Id Target Id Frame

  * 1 Thread 0x7f62fbfadf40 (LWP 23269) 0x00007f62fbb86dbb in __pthread_cond_timedwait (cond=cond@entry=0x55cf7060e038, mutex=mutex@entry=0x55cf7060e010, abstime=abstime@entry=0x7ffc7229a640) at pthread_cond_wait.c:668
    2 Thread 0x7f62f828a700 (LWP 23342) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62f82884a0, expected=0,
    futex_word=0x55cf705f5940) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    3 Thread 0x7f62f7a89700 (LWP 23344) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62f7a871b0, expected=0,
    futex_word=0x55cf705f6580) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    4 Thread 0x7f62f7288700 (LWP 23345) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62e0003f28, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    5 Thread 0x7f62f6a87700 (LWP 23346) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6274008c18, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    6 Thread 0x7f62f6286700 (LWP 23347) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f626c008188, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    7 Thread 0x7f62f5a85700 (LWP 23352) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62f5a834a0, expected=0,
    futex_word=0x55cf705fa4c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    8 Thread 0x7f62f5284700 (LWP 23354) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62f52821b0, expected=0,
    futex_word=0x55cf705f9fb0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    9 Thread 0x7f62f4a83700 (LWP 23355) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62940720f8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    10 Thread 0x7f62effff700 (LWP 23356) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6284003d38, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    11 Thread 0x7f62ef7fe700 (LWP 23357) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f627c006148, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    12 Thread 0x7f62eeffd700 (LWP 23358) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62eeffb4a0, expected=0,
    futex_word=0x55cf70603e04) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    13 Thread 0x7f62ee7fc700 (LWP 23359) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62ee7fa1b0, expected=0,
    futex_word=0x55cf70602f10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    14 Thread 0x7f62edffb700 (LWP 23360) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62cc0042d8, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    15 Thread 0x7f62ed7fa700 (LWP 23361) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62c4005788, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    16 Thread 0x7f62ecff9700 (LWP 23362) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62d8006f18, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    17 Thread 0x7f62ec7f8700 (LWP 23364) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62ec7f64a0, expected=0,
    futex_word=0x55cf706097a0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    18 Thread 0x7f62ebff7700 (LWP 23365) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62ebff51b0, expected=0,
    futex_word=0x55cf70609120) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    19 Thread 0x7f62eb7f6700 (LWP 23366) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62b8004428, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    20 Thread 0x7f62eaff5700 (LWP 23367) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62a40040c8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    21 Thread 0x7f62ea7f4700 (LWP 23368) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f628c0048b8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    22 Thread 0x7f62d2ffd700 (LWP 23374) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6264005168, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    23 Thread 0x7f62d3fff700 (LWP 23375) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f625c004248, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    24 Thread 0x7f62d1ffb700 (LWP 23378) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6254004088, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    25 Thread 0x7f62d17fa700 (LWP 23379) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62600062a8, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    26 Thread 0x7f62d0ff9700 (LWP 23380) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6258003c48, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    27 Thread 0x7f62b3fff700 (LWP 23467) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f624c003b68, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    28 Thread 0x7f62b37fe700 (LWP 23468) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x55cf70d86f38, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    29 Thread 0x7f62b27fc700 (LWP 23476) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6260006078, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    30 Thread 0x7f62b0ff9700 (LWP 23496) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62c8003c18, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    31 Thread 0x7f62a3fff700 (LWP 23502) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f6270001d98, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    32 Thread 0x7f62a37fe700 (LWP 23511) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62680038a8, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    33 Thread 0x7f62a2ffd700 (LWP 23512) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f625c006278, nfds=2, timeout=999) at ../sysdeps/unix/sysv/linux/poll.c:29
    34 Thread 0x7f62e9ff3700 (LWP 23549) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62e9ff14a0, expected=0,
    futex_word=0x55cf7060d724) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    35 Thread 0x7f62e97f2700 (LWP 23550) 0x00007f62fbb86f85 in futex_abstimed_wait_cancelable (private=, abstime=0x7f62e97f01b0, expected=0,
    futex_word=0x55cf7060d020) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    36 Thread 0x7f62b17fa700 (LWP 23551) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f626c0078f8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    37 Thread 0x7f62b1ffb700 (LWP 23552) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62740082d8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    38 Thread 0x7f62e8ff1700 (LWP 23553) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62d80071f8, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    39 Thread 0x7f62d37fe700 (LWP 23554) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f629c003648, nfds=2, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:29
    40 Thread 0x7f62d27fc700 (LWP 23555) 0x00007f62fa7aebf9 in __GI___poll (fds=fds@entry=0x7f62a80016d8, nfds=2, timeout=1001) at ../sysdeps/unix/sysv/linux/poll.c:29
    41 Thread 0x7f62b2ffd700 (LWP 23556) __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
    (gdb)

@GeorgeShagov commented Mar 8, 2019

Hello,
It has happened again, twice. Here is a more detailed report; the main thread:

(gdb) where
#0 0x00007f62fbb86dbb in __pthread_cond_timedwait (cond=cond@entry=0x55cf7060e038, mutex=mutex@entry=0x55cf7060e010, abstime=abstime@entry=0x7ffc7229a640) at pthread_cond_wait.c:668
#1 0x00007f62fa408259 in cnd_timedwait (cond=cond@entry=0x55cf7060e038, mtx=mtx@entry=0x55cf7060e010, ts=ts@entry=0x7ffc7229a640) at tinycthread.c:469
#2 0x00007f62fa408663 in cnd_timedwait_abs (cnd=cnd@entry=0x55cf7060e038, mtx=mtx@entry=0x55cf7060e010, tspec=tspec@entry=0x7ffc7229a640) at tinycthread_extra.c:100
#3 0x00007f62fa3d0e1f in rd_kafka_q_serve_rkmessages (rkq=0x55cf7060e010, timeout_ms=, rkmessages=0x55cf70d90b20, rkmessages_size=10) at rdkafka_queue.c:532
#4 0x00007f62fa39cbac in rd_kafka_consume_batch0 (rkmessages_size=, rkmessages=, timeout_ms=, rkq=) at rdkafka.c:2075
#5 rd_kafka_consume_batch_queue (rkqu=, timeout_ms=, rkmessages=, rkmessages_size=) at rdkafka.c:2121
#6 0x00007f62fb5c5305 in cppkafka::Consumer::poll_batch(unsigned long, std::chrono::duration<long, std::ratio<1l, 1000l> >) () from /home/cexprd/lib/libcppkafka.so.0.2
#7 0x000055cf701b3089 in g_msgLoop<g_normal(const string&, cppkafka::Configuration&, ProducersCollection&)::<lambda(const string&, ProducersCollection&, int&)>, int>(cppkafka::Consumer &, uint32_t, ProducersCollection &, int &) (consumer=..., batch_sz=batch_sz@entry=10, producers=std::vector of length 4, capacity 4 = {...}, Context=, fn=...) at src/./common/msg_loop.h:122
#8 0x000055cf701b5ec3 in g_normal (topic_name="prd.cex.pz1.ord.tg.12", config_consumer=..., producers=std::vector of length 4, capacity 4 = {...}) at src/normal.cpp:45
#9 0x000055cf701b0707 in main (argc=, argv=) at src/main.cpp:2


It seems like this is a busy loop:

#3 0x00007f62fa3d0e1f in rd_kafka_q_serve_rkmessages (rkq=0x55cf7060e010, timeout_ms=, rkmessages=0x55cf70d90b20, rkmessages_size=10) at rdkafka_queue.c:532
532 cnd_timedwait_abs(&rkq->rkq_cond, &rkq->rkq_lock,

from rdkafka_queue.c:532

530
531 while (!(rko = TAILQ_FIRST(&rkq->rkq_q)) &&
532 cnd_timedwait_abs(&rkq->rkq_cond, &rkq->rkq_lock,
533 &timeout_tspec) != thrd_timedout)
534 ;
535

print outs:

(gdb) print rkq->rkq_q
$6 = {tqh_first = 0x0, tqh_last = 0x55cf7060e070}
(gdb) print rko
$7 = (rd_kafka_op_t *) 0x0
(gdb) print rkq
$8 = (rd_kafka_q_t *) 0x55cf7060e010
(gdb) print *rkq
$9 = {rkq_lock = {__data = {__lock = 2, __count = 0, __owner = 23269, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\345Z\000\000\001", '\000' <repeats 26 times>, __align = 2}, rkq_cond = {__data = {{__wseq = 308019789, __wseq32 = {__low = 308019789, __high = 0}}, {
__g1_start = 308019705, __g1_start32 = {__low = 308019705, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 4294967287}, __g1_orig_size = 132, __wrefs = 0, __g_signals = {0, 0}},
__size = "M\002\\022\000\000\000\000\371\001\\022", '\000' <repeats 16 times>, "\367\377\377\377\204", '\000' <repeats 14 times>, __align = 308019789}, rkq_fwdq = 0x0, rkq_q = {tqh_first = 0x0,
tqh_last = 0x55cf7060e070}, rkq_qlen = 0, rkq_qsize = 0, rkq_refcnt = 4, rkq_flags = 3, rkq_rk = 0x55cf70945350, rkq_qio = 0x0, rkq_serve = 0x0, rkq_opaque = 0x0,
rkq_name = 0x7f62fa467740 <FUNCTION.26527> "rd_kafka_cgrp_new"}
(gdb) print timeout_tspec
$10 = {tv_sec = 1551416087, tv_nsec = 1000000000}
(gdb) print thrd_timedout
No symbol "thrd_timedout" in current context.
(gdb) print timeout_ms
$11 =

@GeorgeShagov commented:

I can provide a core dump on demand. Please help!

@GeorgeShagov commented:

#1 0x00007f62fa408259 in cnd_timedwait (cond=cond@entry=0x55cf7060e038, mtx=mtx@entry=0x55cf7060e010, ts=ts@entry=0x7ffc7229a640) at tinycthread.c:469
469 ret = pthread_cond_timedwait(cond, mtx, ts);
(gdb) print *cond
$17 = {__data = {{__wseq = 308019789, __wseq32 = {__low = 308019789, __high = 0}}, {__g1_start = 308019705, __g1_start32 = {__low = 308019705, __high = 0}}, __g_refs = {0, 0}, __g_size = {0,
4294967287}, __g1_orig_size = 132, __wrefs = 0, __g_signals = {0, 0}},
__size = "M\002\\022\000\000\000\000\371\001\\022", '\000' <repeats 16 times>, "\367\377\377\377\204", '\000' <repeats 14 times>, __align = 308019789}
(gdb) print *mtx
$18 = {__data = {__lock = 2, __count = 0, __owner = 23269, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\345Z\000\000\001", '\000' <repeats 26 times>, __align = 2}
(gdb) print *ts
$19 = {tv_sec = 1551416087, tv_nsec = 1000000000}

@edenhill (Contributor) commented:

(gdb) print *ts
$10 = {tv_sec = 1551416087, tv_nsec = 1000000000}

I believe that is the problem: the nanosecond field is >= 1 s.
This was fixed in #2108 by @boatfish: 672dbf4

We'll have a v1.0.0 release out shortly; I suggest you try the latest RC in the meantime: v1.0.0-RC7
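For illustration, a sketch of the class of fix involved (this is not the actual patch in 672dbf4): pthread_cond_timedwait() requires 0 <= tv_nsec < 1000000000 and fails immediately with EINVAL otherwise, so a retry loop around it degenerates into a busy loop. Normalizing the timespec after adding an offset avoids that:

#include <time.h>

/* Add a millisecond offset to an absolute timespec, carrying any
 * nanosecond overflow into tv_sec so that tv_nsec stays within
 * [0, 1000000000). */
static void timespec_add_ms(struct timespec *ts, long ms) {
        ts->tv_sec  += ms / 1000;
        ts->tv_nsec += (ms % 1000) * 1000000L;
        if (ts->tv_nsec >= 1000000000L) {
                ts->tv_sec++;
                ts->tv_nsec -= 1000000000L;
        }
}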

@raj1003 commented Jul 12, 2019

@wuqingjun

40K consumers: just curious, is a broker able to scale to 40K consumers in a consumer group?
