Is your feature request related to a problem? Please describe.
Yes. In a High Availability (HA) OpenSIPS setup utilizing dialog replication and a Virtual IP (VIP), mid-dialog routing fails for active calls after a cluster failover occurs for clients connected via connection-oriented protocols (TCP/TLS).
Detailed Problem Description:
Connection Drop: When the primary server fails, the TCP/TLS connections to the active SIP clients are abruptly dropped.
Re-registration: Clients automatically re-register to the standby server (which has assumed the VIP). Because these are new TCP/TLS sessions, the clients establish new transport sockets.
Stale Dialog Data: The standby server possesses the replicated dialog state, but it still references the old, closed sockets/Contact URIs for the caller and callee.
Routing Failure: When the standby server has to send or relay a mid-dialog request (such as a session-timer refresh, a re-INVITE, or a BYE), it tries to route it through the defunct socket. This results in transport send errors.
Impact: Active calls are left in a "stale" or ghost state on the endpoints until local timers expire, because the proxy can neither deliver a BYE nor handle mid-dialog keepalives.
Describe the solution you'd like
I would like a mechanism where OpenSIPS can dynamically update the active dialog state when a TCP/TLS client re-registers on a new socket following a failover.
Proposed Mechanism:
Unique Identification: Leverage a unique identifier, such as the +sip.instance Contact parameter (RFC 5626) or a custom AVP containing the client's MAC address, that is tracked by both the registrar/usrloc modules and the dialog module.
Registration-Triggered Update: Upon a successful re-registration, OpenSIPS should check whether the registering client (matched via the unique identifier) has any active calls in the replicated dialog table.
Socket Migration: If active dialogs are found, OpenSIPS should update the Contact URIs and destination socket information in the in-memory/replicated dialog state to point at the newly established TCP/TLS connection. This ensures that subsequent mid-dialog requests (re-INVITEs, BYEs) are routed over the active connection.
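To make the three steps above concrete, here is a minimal Python sketch of the matching and migration logic. It is only a toy model of the idea, not OpenSIPS code: the `Leg` and `Dialog` classes, the `migrate_sockets` helper, and the socket handle strings are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Leg:
    """One side of a dialog (caller or callee)."""
    instance_id: str  # value of +sip.instance, e.g. "<urn:uuid:...>"
    contact: str      # Contact URI used for mid-dialog routing
    socket: str       # hypothetical handle for the transport connection


@dataclass
class Dialog:
    call_id: str
    legs: list = field(default_factory=list)


def migrate_sockets(dialogs, instance_id, new_contact, new_socket):
    """Point every leg owned by instance_id at the new connection.

    Returns the Call-IDs of the dialogs that were updated.
    """
    updated = []
    for dlg in dialogs:
        touched = False
        for leg in dlg.legs:
            if leg.instance_id == instance_id:
                leg.contact = new_contact
                leg.socket = new_socket
                touched = True
        if touched:
            updated.append(dlg.call_id)
    return updated


# Replicated state on the standby still points at the dead connections.
dialogs = [
    Dialog("call-1", [
        Leg("<urn:uuid:aaaa>", "sip:alice@10.0.0.5:40001;transport=tls", "tls-conn-17"),
        Leg("<urn:uuid:bbbb>", "sip:bob@10.0.0.9:40002;transport=tls", "tls-conn-18"),
    ]),
]

# Alice re-registers over a new TLS connection after the failover.
touched = migrate_sockets(
    dialogs,
    "<urn:uuid:aaaa>",
    "sip:alice@10.0.0.5:51234;transport=tls",
    "tls-conn-3",
)
```

Only the leg whose instance ID matches the registering client is rewritten; the other leg keeps its old data until that client re-registers too.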
Implementation
- Component: dialog module, usrloc module
- Type: Module parameters and internal event callbacks
- Name: update_on_registration, match_instance_id
- Description: Add a parameter to the dialog module (e.g., modparam("dialog", "update_on_registration", 1)) that registers a callback to the usrloc module's contact insertion/update events. When triggered, if the contact contains a +sip.instance parameter, the dialog module scans active dialogs for a matching instance ID and updates the corresponding memory/replicated socket information.
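The callback wiring described above could look roughly like the following Python sketch. It is a toy model under stated assumptions, not the real usrloc or dialog API: `UsrLoc`, `DialogModule`, `register_callback`, and `routes_by_instance` are hypothetical names, and the constructor flag merely stands in for the proposed `update_on_registration` parameter.

```python
class UsrLoc:
    """Toy stand-in for usrloc: fires callbacks on contact insert/update."""

    def __init__(self):
        self._callbacks = []
        self.contacts = {}

    def register_callback(self, fn):
        self._callbacks.append(fn)

    def save(self, aor, contact, instance_id=None):
        self.contacts[aor] = contact
        for fn in self._callbacks:
            fn(aor, contact, instance_id)


class DialogModule:
    """Toy stand-in for the dialog module."""

    def __init__(self, usrloc, update_on_registration=False):
        # instance ID -> contact currently used for mid-dialog routing
        self.routes_by_instance = {}
        # Subscribe only when the (proposed) parameter is enabled.
        if update_on_registration:
            usrloc.register_callback(self._on_contact_change)

    def _on_contact_change(self, aor, contact, instance_id):
        # Skip contacts without +sip.instance and clients with no active dialog.
        if instance_id is None or instance_id not in self.routes_by_instance:
            return
        self.routes_by_instance[instance_id] = contact


usrloc = UsrLoc()
dlg = DialogModule(usrloc, update_on_registration=True)
dlg.routes_by_instance["<urn:uuid:aaaa>"] = "sip:alice@10.0.0.5:40001;transport=tls"

# Re-registration after failover triggers the callback.
usrloc.save("alice@example.com",
            "sip:alice@10.0.0.5:51234;transport=tls",
            instance_id="<urn:uuid:aaaa>")
```

Keeping the subscription behind the parameter means deployments that do not need the behaviour pay no cost on every registration.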
Describe alternatives you've considered
- Component: dialog module
- Type: Script function
- Name: fix_dialog_sockets([match_key])
- Description: Introduce a new function exported by the dialog module that can be called inside the opensips.cfg file within the registration handling logic (after save()).
Example usage:
    if (is_method("REGISTER")) {
        if (save("location")) {
            # If client re-registered with a sip.instance,
            # fix any active dialog paths to use the new TCP/TLS socket
            if ($ct.fields(instance)) {
                fix_dialog_sockets("$ct.fields(instance)");
            }
        }
    }
Additional context