rtpengine: refactor of node probing #2597

john08burke · 2021-08-11T05:37:26Z

This PR is a slight refactor of the existing probing mechanism for disabled rtpengine nodes. Currently, probing is done within the SIP processing context. This can lead to excessive delay while processing SIP requests when one or more nodes are down or unreachable. For example, the theoretical maximum delay in SIP context is (# down nodes) * (rtpengine_retr * rtpengine_tout) seconds in every rtpengine_disable_tout-second interval, when the down nodes are selected via the weighting algorithm.
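To make the worst-case formula above concrete, here is a minimal sketch of the arithmetic. The function name and the sample values are illustrative assumptions, not the module's code or its defaults.

```c
#include <stdio.h>

/* Hypothetical helper: worst-case seconds spent probing in SIP context
 * during one rtpengine_disable_tout window, following the formula
 * (# down nodes) * (rtpengine_retr * rtpengine_tout). */
static unsigned int max_probe_delay(unsigned int down_nodes,
                                    unsigned int retr,
                                    unsigned int tout)
{
    return down_nodes * (retr * tout);
}

/* Prints the worst case for a set of illustrative (assumed) values. */
static void print_example(void)
{
    unsigned int delay = max_probe_delay(3, 5, 1);
    printf("worst-case probing delay: %u s\n", delay);
}
```

With three down nodes, five retries and a one-second timeout (all assumed values), a SIP worker could block for up to 15 seconds in a single interval.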

This PR proposes to move the probing mechanism out of SIP context and into a timer routine that runs every rtpengine_timer_interval seconds. This should eliminate the delays associated with probing disabled rtpengine nodes.

Let me know your thoughts! The same could likely be applied to the rtpproxy module, but I wanted to get feedback before refactoring rtpproxy.

Probing of disabled rtpengine nodes is now done in timer routine instead of SIP context.
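The split described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in modules/rtpengine/rtpengine.c: the struct, the function names and the probe callback are all hypothetical, locking is omitted, and the selection is first-available rather than weighted.

```c
#include <stddef.h>

/* Hypothetical, simplified node record (not the module's real struct). */
struct node {
    const char *url;
    int disabled;            /* 1 while the node is considered down */
    struct node *next;
};

/* Hypothetical probe callback: returns 1 if the node answered. */
typedef int (*probe_fn)(const struct node *n);

/* Stand-in probe for illustration: pretends every node is reachable. */
static int probe_always_up(const struct node *n)
{
    (void)n;
    return 1;
}

/* Timer routine: runs outside SIP processing and re-enables nodes that
 * respond to a probe. Returns the number of nodes re-enabled. */
static int probe_disabled_nodes(struct node *head, probe_fn probe)
{
    int reenabled = 0;

    if (head == NULL)        /* empty list: nothing to process */
        return 0;

    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->disabled && probe(n)) {
            n->disabled = 0;
            reenabled++;
        }
    }
    return reenabled;
}

/* SIP-path selection: only reads the disabled flag and never probes,
 * so it cannot block on an unreachable node. */
static struct node *select_node(struct node *head)
{
    for (struct node *n = head; n != NULL; n = n->next)
        if (!n->disabled)
            return n;
    return NULL;
}
```

In this model, the cost of waiting on an unreachable node is paid only by the timer routine; the SIP path either finds an enabled node immediately or fails fast.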

razvancrainea

The code looks good, and I indeed agree this new approach is better than the initial one.
Could you please address the change I suggested? If you do, I think we're good to merge it.

modules/rtpengine/rtpengine.c

razvancrainea · 2021-08-13T08:02:55Z

Thanks for the PR, John! The rtpproxy one is welcome as well, if you have time to take care of it :).

john08burke · 2021-08-13T15:39:03Z

Hey @razvancrainea, I will try to put a PR in for the rtpproxy module. I was first going to look into an additional PR for rtpengine, where the rtpengine_reload MI command wouldn't grab the global lock on the node list. When nodes are down, it has downsides similar to the ones this PR addressed!

rtpengine: refactor of node probing

9ccd25f

Probing of disabled rtpengine nodes is now done in timer routine instead of SIP context.

razvancrainea requested changes Aug 12, 2021


modules/rtpengine/rtpengine.c (outdated, resolved)

john08burke added 2 commits August 12, 2021 10:54

rtpengine: use DELAY_ON_DELAY instead of SKIP_ON_DELAY timer flag

b3dd7c5

rtpengine: check for empty list before processing timer job

dc4a2b9

john08burke force-pushed the rtpengine_probing_refactor branch from 959ed5a to dc4a2b9 August 12, 2021 17:08

razvancrainea approved these changes Aug 13, 2021


razvancrainea merged commit 15321ba into OpenSIPS:master Aug 13, 2021
