Skip to content

Slower debounce time in FRR reloading#1583

Merged
fedepaol merged 3 commits intometallb:mainfrom
fedepaol:fixmemleak
Sep 6, 2022
Merged

Slower debounce time in FRR reloading#1583
fedepaol merged 3 commits intometallb:mainfrom
fedepaol:fixmemleak

Conversation

@fedepaol
Copy link
Copy Markdown
Member

@fedepaol fedepaol commented Sep 2, 2022

In corner cases where we have a bouncing service, the number of updates requested to the reloader is higher than how fast the reloader is able to consume those requests. The requests are mutexed, so what happens is that we have a long (and growing) list of reloader child processes spun up by the sighup that we are not able to estinguish.
This consumes memory and may end up in having the speaker killed.

To fix this, we make the debounce timeout longer

Fixes #1581

@fedepaol fedepaol changed the title Reloader optimizations Slower debounce time in FRR reloading Sep 5, 2022
It might be useful to know how long an operation lasts for debugging
purpouses.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
It normally takes 2-3 seconds to execute a full reload command. In case
of multiple requests in flight, this translates in accumulated requests
that end up in pending sighup requests that then translates into
allocated memory (basically we have one child frr_reloader process per
reload request, consuming memory).

In case a service is constantly updated (for example, a dangling
endpoint that makes the service available / not available too fast), we
may end up accumulating memory that we are not able to free fast
(because the number of requests is higher than how fast the reloader can
serve them).

Because of this, it's safer to accumulate the requests for at least how
long it takes to serve one.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
In case the same configuration is propagated to the reloader, we ignore
it instead of calling frr again.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
Copy link
Copy Markdown
Member

@oribon oribon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Endless loop and crash

2 participants