RTCP Memory leak #66
can you please do --leak-check=full --show-reachable=yes to see the full picture?
Here is the valgrind --leak-check=full --show-reachable=yes output.
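For anyone reproducing this, the invocation looks roughly like the sketch below. Note the flag is --show-reachable with two dashes, not -show-reachable. The captagent binary path and the -f config flag are assumptions about a typical source install, not taken from this thread; the sketch only prints the command so it is side-effect free:

```shell
# Assumed paths and flags for a typical install; adjust to your setup.
VALGRIND_OPTS="--leak-check=full --show-reachable=yes"
CAPTAGENT_BIN="/usr/local/captagent/sbin/captagent"   # hypothetical path
CONFIG="/usr/local/captagent/etc/captagent.xml"       # hypothetical path

# Print the command instead of running it, so the sketch has no side effects:
echo "valgrind $VALGRIND_OPTS $CAPTAGENT_BIN -f $CONFIG"
```

--leak-check=full itemizes each leak with a backtrace, and --show-reachable=yes also lists blocks that are still pointed to at exit (such as entries sitting in a hash table), which is what matters for this issue.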
Seeing the same here: multiple servers with call loads over 200 with RTCP enabled, filling up 8G of memory plus swap.
guys, can you please re-test with the latest git version?
It's been running for 5 minutes on two servers handling 80 SIP calls each, and it remains stable for me. If it's OK for you, I will run it a little longer before closing the issue.
yes, please run it a bit longer and we will close the issue :-) thank you!
Will run a full production test as well over 12 nodes running RTCP, but will do so on Monday (as I don't want to test it over the weekend :) )
Looks like the majority of the issue has been resolved. However, on a few hosts I still see RES increasing by 4 bytes every 1-2 seconds, and the call volume is actually decreasing. Will keep monitoring.
thanks for the update. Please keep it running for 2-3 days and see if memory goes up. thank you. Wbr, On 30 May 2016 at 12:01, Matthias van der Vlies notifications@github.com
I also see memory growing a little bit, but not as fast as before. After a 2-day run with calls, RES grew from 188M to 340M as I write. I didn't run valgrind on the new code, but if there's still some "definitely lost", shouldn't we try to address it? I can provide valgrind output as needed ;) I'll keep monitoring too.
just pushed some patches to avoid potential memory leaks. After the patches I have only two points left, but these can be ignored. Can you please recompile and check on your host? ==3875== HEAP SUMMARY: On 30 May 2016 at 12:14, lcligny notifications@github.com wrote:
lcligny, are you using CentOS/RHEL? I still see some leaking on el6. edit: on Debian too, but a lot less. On 05/30/2016 12:14 PM, lcligny wrote:
Matthias, do you use the latest git code? On 30 May 2016 at 14:18, Matthias van der Vlies notifications@github.com
Yes: git log -1 shows commit bbc7e6c
On 05/30/2016 02:26 PM, Alexandr Dubovikov wrote:
can you please run valgrind on one of your boxes? On 30 May 2016 at 14:28, Matthias van der Vlies notifications@github.com
For the record, I'm running vanilla Debian 7 with Asterisk 11 on those boxes.
Correct me if I'm wrong as I just had a quick look, but I can't find any... As you can see, most of the memory is used in database_hash: ==7617== 28,840 bytes in 103 blocks are still reachable in loss record... However: struct ipport_items *ipports still references the call, so it's still referenced. I can find a reference to add_ipport, but none to clear_ipport: captagent/mod/proto_uni/proto_uni.c. I do see a timer is set there, but in the timer code it's not calling... Any thoughts? On 05/30/2016 02:30 PM, Alexandr Dubovikov wrote:
the first part will be released during the timer check: https://github.com/sipcapture/captagent/blob/master/src/modules/database/hash/captarray.c#L97 so it should all be good here. the second part definitely has an issue... checking. Wbr, On 30 May 2016 at 15:03, Matthias van der Vlies notifications@github.com
just checked one more time: the second part doesn't have any issue either, because: https://github.com/sipcapture/captagent/blob/master/src/modules/database/hash/database_hash.c#L287 https://github.com/sipcapture/captagent/blob/master/src/modules/database/hash/database_hash.c#L306 https://github.com/sipcapture/captagent/blob/master/src/modules/database/hash/database_hash.c#L309 I have just changed the expire_hash_rtcp value to rtcp_timeout and you can... Wbr, On 30 May 2016 at 16:11, Alexandr Dubovikov alexandr.dubovikov@gmail.com
Ok, RES is now 616M on one of the machines, 260M on another. I will try the latest git version with your rtcp-timeout changes.
RES is 187M and 203M after a non-stop 24h run, so for my setup and workload the original issue is solved. I will let killdashnine continue his testing.
thanks for the update. Can you please let us know if RES grows or... Wbr, On 31 May 2016 at 10:01, lcligny notifications@github.com wrote:
276M and 282M RES today, so about 100M more than yesterday at the same time. I'm wondering if it will continue to grow steadily. In that case it would be difficult to just run it and forget.
Well, on at least 3 machines captagent crashed after the last git pull (I... But on one of the hosts I started with 38M RES and it's now 4G. On 06/01/2016 08:47 AM, lcligny wrote:
I have updated uthash. Can you please re-check again? I will prepare a redis version just to be sure that the problem is really in the... thanks for your help! Wbr, On 1 June 2016 at 08:47, lcligny notifications@github.com wrote:
Ok, I have updated all the machines now. I think the processes got killed because they used too much memory. Starting now at around 36-39M RES on all machines; I will leave it running for 24h unless I see a big increase during the day.
I have updated too, to see if it helps. Starting at 108M RES (I listen on 3 interfaces, that's why I start with higher memory use).
Small update: grown from 39M to 65M, and captagent is using 100% CPU (I have seen this start to happen after bc9a735).
looks like I have found the memory leak. Can you please compile the latest... On 1 June 2016 at 13:25, Matthias van der Vlies notifications@github.com
Memory is looking good, not seeing the 4 KB per 2 seconds anymore, but CPU is at 100%. edit: spoke too soon, increasing rapidly now (4 kB/s).
I am not sure about the CPU... there is nothing changed that could have such an impact... On 1 June 2016 at 13:57, Matthias van der Vlies notifications@github.com
I have not; I'm using the default files. Also see my update, now it's... On 06/01/2016 02:04 PM, Alexandr Dubovikov wrote:
just wait, this is the memory hash growing; at some point it will stop. About the CPU, can you please confirm that the version before bc9a735... On 1 June 2016 at 14:06, Matthias van der Vlies notifications@github.com
any progress? ;-) On 1 June 2016 at 14:10, Alexandr Dubovikov alexandr.dubovikov@gmail.com
Memory is now at 51M (started at 37M), but there are 95 active calls on this... Anyway, I didn't have the chance yet to check that revision. There are... This is so far only on 3 specific nodes running CentOS 6. The Debian node is... Will update you tomorrow. On 06/01/2016 04:14 PM, Alexandr Dubovikov wrote:
For information, after 12 hours with the latest git on my two SIP boxes running captagent, RES is now at 120M and 119M. It started, as every time for me, at 108M.
so, this means both memory leaks have been fixed. Correct? On 2 June 2016 at 09:04, lcligny notifications@github.com wrote:
For my setup, yes.
The max RES I have is 107M (started at 36M). I'll leave it running another day to see if it grows even bigger. Still seeing the 100% CPU issue, with 2 threads running at 100%, one of which is doing a read constantly. Hopefully I will be able to perform some tests on that today.
Unfortunately RES is still growing, but not as fast as previously; it is at 172M now. Didn't have a chance to figure out the 100% CPU issue yet.
Matthias, are you sure that you have cleaned up and replaced all modules? On 3 June 2016 at 11:50, Matthias van der Vlies notifications@github.com
Yes, running 'make uninstall' and 'make clean' first before pulling and compiling.
@lcligny, how does it look for you? Do you still have a memory leak? On 3 June 2016 at 12:01, Matthias van der Vlies notifications@github.com
Nope, I'm still at 120M RES; it hasn't grown since yesterday morning.
Matthias, can you check that after make uninstall all of captagent's files are removed? On 3 June 2016 at 12:28, lcligny notifications@github.com wrote:
I can confirm that all files were removed; only configuration files were not deleted, but all .so/.a/.la files and the captagent binary were removed. I can also confirm that make clean deletes all the .so/.a/.o files from the source directory.
Matthias, can you check it on a test server, if you have one? Wbr, On 3 June 2016 at 12:49, Matthias van der Vlies notifications@github.com
Hi, this actually was a test server; I checked it with locate (after running updatedb each time) to make sure all files were deleted.
any updates? On 3 June 2016 at 13:54, Matthias van der Vlies notifications@github.com
Still ok and in production for me. Do you want me to close the issue?
I want to, but I'm not sure what's going on with Matthias. :-) On 7 June 2016 at 09:52, lcligny notifications@github.com wrote:
Sorry guys, a bit busy on my end. RES is max 267M, so I think under high call load it (the hash table) extends... Still seeing the 100% CPU issue; having a busy week, so I hope to get back... On 06/07/2016 10:18 AM, Alexandr Dubovikov wrote:
ok, thank you. Let's close this ticket as solved for the memory leak and create a new... On 7 June 2016 at 10:45, Matthias van der Vlies notifications@github.com
I have opened #70 for the CPU usage issue.
Hello,
On a system handling about 100 simultaneous calls, the captagent process VIRT, SHR and RES memory grows rapidly, finally eating up all the resources.
With "socketspcap_rtcp" set to enable="false", the memory usage remains stable.
Here is a valgrind --leak-check=yes output for captagent with socketspcap_rtcp set to "true" and 100 simultaneous calls during less than one minute: