Memory usage of librdkafka #3343

YarinLowe · 2021-04-11T11:03:47Z

Description

Hi,
I'm running several librdkafka instances on several machines, connecting to several clusters (each client connects to one cluster).
I found out that librdkafka's clients (producer and consumer) use a large amount of memory:
For an instance connecting to a cluster of 3 brokers, thus running 4-5 threads, there's a memory usage of 400-800MB - which is really problematic when I need to run several instances on a single machine (e.g. on different applications).
The memory usage is high even before connecting to the cluster (proved by running the client with dummy non-existing bootstrap servers).
I tried to 'play' with the configuration (e.g. lowering queue.buffering.max.kbytes) but nothing seemed to help (or even to have any effect).
I tried to configure some dummy bootstrap servers and found out that each server I add to the list (even if it does not exist) adds 80-150MB of memory (and another broker thread, of course).
Connecting to different clusters results in varied memory usage: some instances use 400MB and others use 800MB. There's no significant difference between them. They are all "idle" producers, each sending one message every 3 seconds to only 2 topics; the main difference is that they connect to different clusters.

To sum up the questions:

Is it the expected amount of memory? (for an instance connecting to a cluster of 3 brokers)
What affects the memory usage? Is it the configuration (apart from the bootstrap list)? Does the server [cluster] have any effect (e.g. the number of topics, even topics this client does not use)?
Is there any way to reduce the memory usage?

Thanks!

How to reproduce

Simply use a default configuration object and initialize a producer/consumer.

Checklist

Please provide the following information:

librdkafka version: v1.5.2
Apache Kafka version: 2.7.0
librdkafka client configuration: default
Operating system: Ubuntu 16.04
Provide logs (with debug=.. as necessary) from librdkafka
Provide broker log excerpts
Critical issue


edenhill · 2021-04-12T07:49:55Z

No, an idle librdkafka instance should not consume much memory; I'd guess less than a megabyte.

Is this a producer instance? If so, are you producing messages whose volume could correlate with the memory size?

Or is it a consumer? If so, it could be the pre-fetch queues filling up.

edenhill · 2021-04-12T07:50:50Z

You could try running your application with valgrind and, when the memory size is large, kill the application to get a leak report (which should then show all active allocations).

YarinLowe · 2021-04-12T07:54:02Z

@edenhill, most of my checks were with a producer instance - and it was using a lot of memory before even connecting to the cluster and before any messages were produced, so I can't find any correlation with anything.
I also tried running with valgrind, but encountered some issues - will try to run it again.

YarinLowe · 2021-04-12T12:52:20Z

I found out that only the VSZ (virtual size) of the process rose sharply when creating a producer.
The physical memory usage (RSS, resident set size) grew by only ~1MB, as you say (and it seems to stay like that).
So the problem is, I think, much less critical than I thought. But is it really fine (and expected) that librdkafka has a VSZ of a few hundred MB? (Configuration changes didn't have any effect on it.)
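To compare the two figures on Linux without `ps` or `top`, the `VmSize` (virtual) and `VmRSS` (resident) fields of `/proc/<pid>/status` can be read directly. The following is a minimal sketch, not part of the original report: the helper name `parse_vm_fields` is made up for illustration, and the sample text is hand-written using the VIRT/RES values that appear in the `top` output later in this thread.

```python
def parse_vm_fields(status_text):
    """Return the VmSize and VmRSS values (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # the kernel reports these in kB
    return fields


# Hand-written sample in the /proc/<pid>/status layout (illustrative only).
SAMPLE = "Name:\tkafkacat\nVmSize:\t  421372 kB\nVmRSS:\t   11828 kB\n"

vm = parse_vm_fields(SAMPLE)
print(f"virtual={vm['VmSize']} kB resident={vm['VmRSS']} kB")
```

For a live process, the same function could be fed `open(f"/proc/{pid}/status").read()`. A large gap between the two values means address space has been reserved but not touched.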

zhangwen-network · 2021-04-13T03:13:57Z

@YarinLowe I have the same problem as you. When I run the test producer on my ARM-based device, the VSZ is 220M. That is unacceptable for my device, so I also want to figure out why it happens and how to reduce it.

edenhill · 2021-04-13T07:10:23Z

Running an idle kafkacat producer connected to a cluster with 3 known brokers, we look at the process memory map:

$ pmap $(pidof kafkacat)  | cut -d ' ' -f 2- | sort -nr | head -20
 65404K -----   [ anon ]
 65404K -----   [ anon ]
 65404K -----   [ anon ]
 65404K -----   [ anon ]
 65404K -----   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  1644K r-x-- libcrypto.so.1.1
  1504K r-x-- libc-2.31.so
  1288K r-x-- libdb-5.3.so
  1244K r---- libunistring.so.2.1.0
  1228K r-x-- librdkafka.so.1
  1160K r-x-- libgnutls.so.30.27.0
 ...

That's 8 anonymous 8MB allocations and 5 anonymous 64MB allocations.

Let's get the thread count:

$ top -b -n 1 -H -p $(pidof kafkacat) 
top - 09:02:19 up 1 day, 2 min,  1 user,  load average: 0.92, 0.61, 0.52
Threads:   9 total,   0 running,   9 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.5 us,  0.8 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  32075.9 total,   9159.0 free,   9362.2 used,  13554.6 buff/cache
MiB Swap:  30864.0 total,  30864.0 free,      0.0 used.  21889.0 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 115032 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.00 kafkacat
 115033 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.62 rdk:main
 115034 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.01 rdk:broker-1
 115035 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.01 rdk:broker-1
 115036 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.02 rdk:broker-1
 115037 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.01 rdk:broker-1
 115038 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.00 rdk:broker5
 115039 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.01 rdk:broker3
 115040 maglun    20   0  421372  11828  10180 S   0.0   0.0   0:00.00 rdk:broker4

8 librdkafka threads, plus the application's own main thread. (see https://github.com/edenhill/librdkafka/wiki/FAQ#number-of-internal-threads)

Now let's try that again with kafkacat just knowing about one broker:

$ pmap $(pidof kafkacat)  | cut -d ' ' -f 2- | sort -nr | head -20
 65404K -----   [ anon ]
 65404K -----   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  8192K rw---   [ anon ]
  1644K r-x-- libcrypto.so.1.1
  1504K r-x-- libc-2.31.so
  1288K r-x-- libdb-5.3.so
  1244K r---- libunistring.so.2.1.0
  1228K r-x-- librdkafka.so.1

3 x 8MB, 2 x 64 MB.

$ top -b -n 1 -H -p $(pidof kafkacat) 
top - 09:03:38 up 1 day, 4 min,  1 user,  load average: 0.50, 0.56, 0.51
Threads:   4 total,   0 running,   4 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.0 us,  0.0 sy,  0.0 ni, 95.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  32075.9 total,   9157.3 free,   9363.9 used,  13554.7 buff/cache
MiB Swap:  30864.0 total,  30864.0 free,      0.0 used.  21887.5 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 116456 maglun    20   0  183784  11980  10336 S   0.0   0.0   0:00.01 kafkacat
 116457 maglun    20   0  183784  11980  10336 S   0.0   0.0   0:00.00 rdk:main
 116458 maglun    20   0  183784  11980  10336 S   0.0   0.0   0:00.00 rdk:broker-1
 116459 maglun    20   0  183784  11980  10336 S   0.0   0.0   0:00.00 rdk:broker-1

for 3 librdkafka threads (top lists 4, including the application's main thread).

The default per thread stack size on Linux is 8 MB, so the 8MB allocations are most likely per-thread stacks.

The 64 MB allocations, on the other hand, are proportional but not equal to the number of threads. What we could be seeing is on-demand per-thread heap (malloc) space.
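As a rough cross-check of this reading, the mapping counts from the two `pmap` listings can be multiplied out and compared with the VIRT column from `top`. This is only a back-of-the-envelope sketch: the `pmap` output was truncated by `head -20`, so every smaller mapping (shared libraries, the main thread's stack, and so on) ends up in the "other" bucket, and labelling the 65404K regions as "arenas" follows the assumption above rather than anything confirmed.

```python
STACK_KB = 8192   # size of each rw--- anonymous mapping in the pmap output
ARENA_KB = 65404  # size of each ----- anonymous mapping in the pmap output

# Mapping counts and VIRT (KiB) copied from the two runs shown in this comment.
runs = {
    "3 known brokers": {"stacks": 8, "arenas": 5, "virt_kb": 421372},
    "1 known broker": {"stacks": 3, "arenas": 2, "virt_kb": 183784},
}


def breakdown(run):
    """Split VIRT into thread stacks, presumed malloc arenas, and everything else."""
    stacks_kb = run["stacks"] * STACK_KB
    arenas_kb = run["arenas"] * ARENA_KB
    other_kb = run["virt_kb"] - stacks_kb - arenas_kb
    return stacks_kb, arenas_kb, other_kb


for name, run in runs.items():
    stacks_kb, arenas_kb, other_kb = breakdown(run)
    print(f"{name}: stacks={stacks_kb}K arenas={arenas_kb}K other={other_kb}K")
```

In both runs the stacks and the 64 MB regions together account for most of VIRT (roughly 93% and 85%), while the remainder stays nearly constant at about 28 MB, which is consistent with the virtual size scaling with the number of threads rather than with any data held by the client.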

edenhill · 2021-04-13T07:24:29Z

Calling malloc_info(3) might strengthen this assumption. This is from the run with 8 threads and five 64 MB chunks; see the <heap nr=..> tags:

<malloc version="1">
<heap nr="0">
<sizes>
  <unsorted from="145" to="145" total="145" count="1"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="2" size="107777"/>
<system type="current" size="417792"/>
<system type="max" size="417792"/>
<aspace type="total" size="417792"/>
<aspace type="mprotect" size="417792"/>
</heap>
<heap nr="1">
<sizes>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="1" size="124832"/>
<system type="current" size="135168"/>
<system type="max" size="135168"/>
<aspace type="total" size="135168"/>
<aspace type="mprotect" size="135168"/>
<aspace type="subheaps" size="1"/>
</heap>
<heap nr="2">
<sizes>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="1" size="132272"/>
<system type="current" size="135168"/>
<system type="max" size="135168"/>
<aspace type="total" size="135168"/>
<aspace type="mprotect" size="135168"/>
<aspace type="subheaps" size="1"/>
</heap>
<heap nr="3">
<sizes>
  <unsorted from="1569" to="1569" total="1569" count="1"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="2" size="122641"/>
<system type="current" size="135168"/>
<system type="max" size="135168"/>
<aspace type="total" size="135168"/>
<aspace type="mprotect" size="135168"/>
<aspace type="subheaps" size="1"/>
</heap>
<heap nr="4">
<sizes>
  <unsorted from="1121" to="1121" total="1121" count="1"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="2" size="34177"/>
<system type="current" size="217088"/>
<system type="max" size="217088"/>
<aspace type="total" size="217088"/>
<aspace type="mprotect" size="217088"/>
<aspace type="subheaps" size="1"/>
</heap>
<heap nr="5">
<sizes>
  <unsorted from="1569" to="1569" total="1569" count="1"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="2" size="126625"/>
<system type="current" size="135168"/>
<system type="max" size="135168"/>
<aspace type="total" size="135168"/>
<aspace type="mprotect" size="135168"/>
<aspace type="subheaps" size="1"/>
</heap>
<total type="fast" count="0" size="0"/>
<total type="rest" count="10" size="648324"/>
<total type="mmap" count="0" size="0"/>
<system type="current" size="1175552"/>
<system type="max" size="1175552"/>
<aspace type="total" size="1175552"/>
<aspace type="mprotect" size="1175552"/>
</malloc>
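The heap count can also be pulled out of this XML mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the output above (only the per-heap `<system type="current">` and `<aspace type="subheaps">` elements are kept). Treating the heaps that carry a `subheaps` element as the non-main arenas is an assumption about glibc's output, not something stated in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the malloc_info(3) output above; sizes are in bytes.
MALLOC_INFO = """<malloc version="1">
<heap nr="0"><system type="current" size="417792"/></heap>
<heap nr="1"><system type="current" size="135168"/><aspace type="subheaps" size="1"/></heap>
<heap nr="2"><system type="current" size="135168"/><aspace type="subheaps" size="1"/></heap>
<heap nr="3"><system type="current" size="135168"/><aspace type="subheaps" size="1"/></heap>
<heap nr="4"><system type="current" size="217088"/><aspace type="subheaps" size="1"/></heap>
<heap nr="5"><system type="current" size="135168"/><aspace type="subheaps" size="1"/></heap>
</malloc>"""

root = ET.fromstring(MALLOC_INFO)
heaps = root.findall("heap")

# Bytes currently obtained from the system, per heap.
current = {
    h.get("nr"): int(h.find("system[@type='current']").get("size")) for h in heaps
}
# Heaps that report a subheap count (assumed here to be the non-main arenas).
secondary = [h.get("nr") for h in heaps if h.find("aspace[@type='subheaps']") is not None]

print(f"{len(heaps)} heaps, {len(secondary)} with subheaps, "
      f"{sum(current.values())} bytes in use by the allocator")
```

Five heaps carry a `subheaps` element, matching the five 64 MB regions in the `pmap` listing, and the per-heap sizes add up to the 1175552 bytes reported in the overall `<system type="current">` total, so the allocator actually holds only about 1.1 MB.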

edenhill · 2021-04-13T07:26:46Z

There is not much we can do about this in librdkafka, short of a redesign of the threading model, so I suggest you see if you can reuse the same producer instance (which is recommended unless you need different configs), or use another allocator, e.g. tcmalloc.
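One way to try a different allocator without relinking is to preload it through the dynamic linker's `LD_PRELOAD` variable when launching the application. The sketch below only builds the environment for such a launch. The helper name `env_with_preload`, the library path, and the `./producer` binary are all hypothetical, and preloading is an alternative to linking the allocator in, so results may differ from a linked build.

```python
import os


def env_with_preload(lib_path, base_env=None):
    """Return a copy of the environment with lib_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # LD_PRELOAD takes a colon- or space-separated list of shared objects.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


# Hypothetical install location; check where your distribution puts tcmalloc.
TCMALLOC_PATH = "/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4"

env = env_with_preload(TCMALLOC_PATH, base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])

# To launch a (hypothetical) producer binary with this environment:
#   import subprocess
#   subprocess.run(["./producer"], env=env)
```

Comparing the VSZ of the process with and without the preload should show whether the allocator is what reserves the 64 MB regions on a given system.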

zhangwen-network · 2021-04-13T11:50:04Z

@edenhill First, thank you very much for your answers on this issue. I have a question: of the 3 threads below, why are two of them named "broker-1" and "broker0"? Does running only a producer also start broker functionality? I only need the producer.
29533 root 20 0 165060 4092 3648 S 0.0 0.0 0:00.00 producer
29534 root 20 0 165060 4092 3648 S 0.0 0.0 0:03.51 rdk:main
29535 root 20 0 165060 4092 3648 S 0.0 0.0 0:00.10 rdk:broker-1
29536 root 20 0 165060 4092 3648 S 0.0 0.0 0:00.12 rdk:broker0

edenhill · 2021-04-13T11:55:01Z

broker -1 == bootstrap broker

rolandyoung · 2021-04-13T12:28:08Z

A "broker" thread is created for each broker the producer connects to. See https://github.com/edenhill/librdkafka/wiki/FAQ#number-of-broker-tcp-connections for an explanation of why there may be two broker connection threads even if there is only one broker in the cluster.

zhangwen-network · 2021-04-19T03:10:53Z

After linking tcmalloc into my test application, the VSZ dropped from 220M to 44M.

edenhill closed this as completed Apr 8, 2022
