Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

topoh: PKG memory leak #2027

Closed
WhiteyDude opened this issue Aug 6, 2019 · 9 comments

Comments

@WhiteyDude
Copy link
Contributor

commented Aug 6, 2019

Description

I've noticed PKG memory increasing since enabling the topoh module last week in a significant manner (~7MB per worker over 3000 calls split between 4 workers). Removing the topoh module removed the memory usage.

Here is a pkg dump of one of the process post 3000 calls:

https://paste.ubuntu.com/p/85KYg7gD2z/

Discussion before reaching this conclusion:

https://lists.kamailio.org/pipermail/sr-users/2019-August/106345.html

Additional Information

kamailio 5.3.0-dev4 (x86_64/linux) 6d43ea-dirty.

I'm unsure why it's tagged dirty, as I simply cloned master and reset to 6d43eac.

I do plan to test this on dev7, however an unrelated issue is preventing me from testing this with my app_ruby based code. I will update if/when I am able to do this.

@WhiteyDude

This comment has been minimized.

Copy link
Contributor Author

commented Aug 7, 2019

Update: app_ruby issues have been fixed in latest master. I'll build a new machine on latest master tonight and split some live calls over there with topoh enabled to see if we can replicate the issue in master. Will report back in ~ 24 hours

@WhiteyDude

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2019

Tested with kamailio 5.3.0-dev7 (x86_64/linux) 4c537a (4c537a6).

Initial (after a couple of hundred test calls)l: https://paste.ubuntu.com/p/3PNrTRf8Kw/

3881 calls later: https://paste.ubuntu.com/p/Pg6dsQ94qt/

It looks like this is still happening in the current topoh.

miconda added a commit that referenced this issue Aug 12, 2019
- this event can execute a series of callbacks, leading to leak if only
the core function does pkg free
- GH #2027
@miconda

This comment has been minimized.

Copy link
Member

commented Aug 12, 2019

This may have been a result of having two modules using the same callback to update the content of the outbound buffer, in your case the topoh (hiding header attributes) and dialog (updating the cseq).

I pushed a bunch of patches to core and modules using this callback. They are in master branch for now. Can you test? If all ok I can backport.

@henningw

This comment has been minimized.

Copy link
Contributor

commented Aug 12, 2019

Ah, this might be the cause that I did not managed to reproduce it with only one module..

@WhiteyDude

This comment has been minimized.

Copy link
Contributor Author

commented Aug 13, 2019

Hi @miconda. Thanks so much for this!

I tested with kamailio 5.3.0-dev7 (x86_64/linux) 4c537a

Unfortunately after ~ 20 hours/10k calls:

https://paste.ubuntu.com/p/vBmpQXKztj/

I somehow lost the dump I took initially, however please find attached graph using the flawed methodology I used previously of summing together all children memory usage. Even though it's not a great measurement, it does prove the continued growth:

Screen Shot 2019-08-14 at 9 17 04 am

@miconda

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

The version you used is before the fix -- the commit id in your version points to:

commit 4c537a618949eb48ffed9297f3abc8cdc879b70d
Author: Daniel-Constantin Mierla <miconda@gmail.com>
Date:   Tue Aug 6 21:37:56 2019 +0200

    misc_radius: increase MAX_EXTRA from 4 to 8

To have the version with the fix, it has to be after the commit:

commit c79dfbeab0bfefaa4dd5cefc41cba3ba157da0ce
Author: Daniel-Constantin Mierla <miconda@gmail.com>
Date:   Mon Aug 12 11:18:39 2019 +0200

    corex: free old outbound buffer inside SREV_NET_DATA_OUT callback
@WhiteyDude

This comment has been minimized.

Copy link
Contributor Author

commented Aug 14, 2019

Whoops! Thanks Daniel, looks like I forgot to make install on my UAT server.

I'll re-test and update ASAP.

@WhiteyDude

This comment has been minimized.

Copy link
Contributor Author

commented Aug 15, 2019

Hi @miconda . I can confirm that this patch works. I put through ~ 4000 calls and roughly a 100kb increase in memory across all children (25kb each), and only as the first couple of calls came in. After that, 0 growth.

I believe this is fixed, thanks so much! I'll leave this open as you said you wish to backport.

miconda added a commit that referenced this issue Aug 20, 2019
- this event can execute a series of callbacks, leading to leak if only
the core function does pkg free
- GH #2027

(cherry picked from commit 98249e2)
miconda added a commit that referenced this issue Aug 20, 2019
miconda added a commit that referenced this issue Aug 20, 2019
@miconda

This comment has been minimized.

Copy link
Member

commented Aug 20, 2019

Patches are backported to branch 5.2.

@miconda miconda closed this Aug 20, 2019
miconda added a commit that referenced this issue Sep 24, 2019
- this event can execute a series of callbacks, leading to leak if only
the core function does pkg free
- GH #2027

(cherry picked from commit 98249e2)
(cherry picked from commit 891bd92)
miconda added a commit that referenced this issue Sep 24, 2019
- GH #2027

(cherry picked from commit 37a1765)
(cherry picked from commit aaeb0ca)
miconda added a commit that referenced this issue Sep 24, 2019
- GH #2027

(cherry picked from commit 21816a1)
(cherry picked from commit bbac072)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.