rtpengine: refactor of node probing #2597
Merged
This PR is a slight refactor of the existing probing mechanism for disabled rtpengine nodes. Currently, probing is done within the SIP processing context, which can cause excessive delay while processing SIP requests when one or more nodes are down or unreachable. For example, the theoretical maximum delay in SIP context is (# of down nodes) * (`rtpengine_retr` * `rtpengine_tout`) seconds in every `rtpengine_disable_tout`-second interval, whenever the down nodes are selected by the weighting algorithm.

This PR proposes to move the probing mechanism out of SIP context and into a timer routine that runs every `rtpengine_timer_interval` seconds. This should eliminate the delays associated with probing disabled rtpengine nodes.

Let me know your thoughts! The same could likely be applied to the rtpproxy module, but I wanted to get feedback before refactoring rtpproxy.
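
To make the delay formula and the proposed split concrete, here is a minimal sketch in C. It is not the code from this PR: `struct probe_node`, `max_sip_probe_delay`, `select_node`, and `probe_disabled_nodes` are hypothetical names, the `reachable` flag stands in for a real ping to the node, and selection ignores weights for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node record; the real module's node struct differs. */
struct probe_node {
    bool disabled;   /* node is currently marked down */
    bool reachable;  /* stand-in for the outcome of a real ping */
};

/* Worst-case seconds spent probing in SIP context per
 * rtpengine_disable_tout interval, following the formula in the
 * description: down nodes * (rtpengine_retr * rtpengine_tout). */
static unsigned max_sip_probe_delay(unsigned down_nodes,
                                    unsigned retr, unsigned tout)
{
    return down_nodes * retr * tout;
}

/* SIP path under the proposal: return the first enabled node and
 * never probe a disabled one (weights omitted in this sketch). */
static struct probe_node *select_node(struct probe_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!nodes[i].disabled)
            return &nodes[i];
    return NULL;
}

/* Timer path: probe every disabled node and re-enable those that
 * answer. Returns the number of nodes re-enabled on this tick. */
static size_t probe_disabled_nodes(struct probe_node *nodes, size_t n)
{
    size_t revived = 0;
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].disabled && nodes[i].reachable) {
            nodes[i].disabled = false;
            revived++;
        }
    }
    return revived;
}
```

With illustrative values of 2 down nodes, 5 retries, and a 1-second timeout, `max_sip_probe_delay(2, 5, 1)` gives 10 seconds of potential blocking in SIP context. In the timer-based arrangement that cost is paid only inside `probe_disabled_nodes`, while `select_node` stays free of network waits.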