[FEATURE] SIP call preservation in case of failover #3903

@abdelwahab706

Description

Is your feature request related to a problem? Please describe.

Yes. In a High Availability (HA) OpenSIPS setup utilizing dialog replication and a Virtual IP (VIP), mid-dialog routing fails for active calls after a cluster failover occurs for clients connected via connection-oriented protocols (TCP/TLS).

Detailed Problem Description:

Connection Drop: When the primary server fails, the TCP/TLS connections to the active SIP clients are abruptly dropped.

Re-registration: Clients automatically re-register to the standby server (which has assumed the VIP). Because these are new TCP/TLS sessions, the clients establish new transport sockets.

Stale Dialog Data: The standby server possesses the replicated dialog state, but it still references the old, closed sockets/Contact URIs for the caller and callee.

Routing Failure: When the standby server attempts to send or relay mid-dialog requests (such as session-timer refreshes, re-INVITEs, or a BYE), it tries to route them through the defunct sockets. This results in transport send errors.

Impact: Active calls are left in a "stale" or ghost state on the endpoints until local timers expire, as the proxy cannot successfully deliver a BYE or handle mid-dialog keepalives.
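The failure sequence above can be illustrated with a small toy model. This is only a sketch in Python, not OpenSIPS code: the `Proxy` class, the `conn_id` field, and the addresses are hypothetical stand-ins for the proxy's connection table and the replicated dialog state.

```python
import copy
from dataclasses import dataclass


class TransportError(Exception):
    """Raised when a request cannot be written to the recorded connection."""


@dataclass
class DialogLeg:
    contact: str  # Contact URI learned at call setup
    conn_id: int  # hypothetical ID of the TCP/TLS connection used for this leg


class Proxy:
    def __init__(self, name):
        self.name = name
        self.connections = {}  # conn_id -> peer address; local to this node
        self.dialogs = {}      # call_id -> list of legs; replicated to peers

    def send_in_dialog(self, call_id, leg_index, method):
        leg = self.dialogs[call_id][leg_index]
        if leg.conn_id not in self.connections:
            raise TransportError(
                f"{method} to {leg.contact}: connection {leg.conn_id} "
                f"does not exist on {self.name}"
            )
        return f"{method} sent to {leg.contact} via connection {leg.conn_id}"


# Before failover: the primary owns the live connections and the dialog.
primary = Proxy("primary")
primary.connections = {7: "198.51.100.10:50311", 8: "198.51.100.20:40122"}
primary.dialogs["call-1"] = [
    DialogLeg("sip:alice@198.51.100.10:50311;transport=tls", conn_id=7),
    DialogLeg("sip:bob@198.51.100.20:40122;transport=tls", conn_id=8),
]

# After failover: dialog state was replicated, connections were not.
standby = Proxy("standby")
standby.dialogs = copy.deepcopy(primary.dialogs)
# Alice re-registers over a brand-new connection from a new ephemeral port.
standby.connections[1] = "198.51.100.10:50412"

try:
    standby.send_in_dialog("call-1", 0, "BYE")
except TransportError as err:
    print(err)  # the dialog still points at connection 7, which is gone
```

The model shows that Alice is reachable on the standby (connection 1 exists), yet the replicated dialog still points at a connection that only ever existed on the primary.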

Describe the solution you'd like

I would like a mechanism where OpenSIPS can dynamically update the active dialog state when a TCP/TLS client re-registers on a new socket following a failover.

Proposed Mechanism:

Unique Identification: Leverage a unique identifier—such as the +sip.instance parameter (RFC 5626) or a custom AVP containing the client's MAC address—which is tracked by both the registrar/usrloc modules and the dialog module.

Registration-Triggered Update: Upon a successful re-registration, OpenSIPS should check if the registering client (matched via the unique identifier) has any active ongoing calls in the replicated dialog table.

Socket Migration: If active dialogs are found, OpenSIPS should dynamically update the in-memory/replicated dialog state's Contact URIs and destination socket information with the newly established TCP/TLS transport socket. This will ensure that subsequent mid-dialog requests (re-INVITEs, BYEs) are successfully routed over the active connection.

Implementation

  • Component: dialog module, usrloc module

  • Type: Module parameters and internal event callbacks

  • Name: update_on_registration, match_instance_id

  • Description: Add a parameter to the dialog module (e.g., modparam("dialog", "update_on_registration", 1)) that registers a callback to the usrloc module's contact insertion/update events. When triggered, if the contact contains a +sip.instance parameter, the dialog module scans active dialogs for a matching instance ID and updates the corresponding memory/replicated socket information.
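The callback flow described above could look roughly like the following. This is a minimal sketch in Python rather than OpenSIPS C code; every class, method, and field name here (`Usrloc`, `DialogModule`, `register_callback`, `conn_id`, and so on) is hypothetical and only models the proposed behaviour. `update_on_registration` mirrors the proposed module parameter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Contact:
    uri: str
    conn_id: int                    # connection the REGISTER arrived on
    instance: Optional[str] = None  # value of +sip.instance, if present


@dataclass
class DialogLeg:
    contact_uri: str
    conn_id: int
    instance: Optional[str] = None  # instance ID recorded at call setup


@dataclass
class Dialog:
    call_id: str
    legs: List[DialogLeg]


class Usrloc:
    """Toy registrar that notifies subscribers on contact insert/update."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Contact], None]] = []

    def register_callback(self, cb: Callable[[Contact], None]) -> None:
        self._callbacks.append(cb)

    def save(self, contact: Contact) -> None:
        for cb in self._callbacks:
            cb(contact)


class DialogModule:
    def __init__(self, usrloc: Usrloc, update_on_registration: bool = False):
        self.dialogs: Dict[str, Dialog] = {}
        if update_on_registration:
            usrloc.register_callback(self.on_contact_update)

    def on_contact_update(self, contact: Contact) -> None:
        if contact.instance is None:
            return  # nothing to match on
        # Linear scan for clarity; a real table would index by instance ID.
        for dlg in self.dialogs.values():
            for leg in dlg.legs:
                if leg.instance == contact.instance:
                    leg.contact_uri = contact.uri
                    leg.conn_id = contact.conn_id


usrloc = Usrloc()
dlg = DialogModule(usrloc, update_on_registration=True)
alice_id = "<urn:uuid:00000000-0000-1000-8000-000a95a0e128>"
dlg.dialogs["call-1"] = Dialog("call-1", [
    DialogLeg("sip:alice@198.51.100.10:50311;transport=tls", 7, alice_id),
    DialogLeg("sip:bob@198.51.100.20:40122;transport=tls", 8),
])

# Alice re-registers after the failover over a new connection.
usrloc.save(Contact("sip:alice@198.51.100.10:50412;transport=tls", 42, alice_id))
print(dlg.dialogs["call-1"].legs[0])  # leg now points at connection 42
```

Only the leg whose recorded instance ID matches the registering contact is rewritten; the other leg is left alone. A real implementation would also need to replicate the updated leg to the other cluster nodes, which this sketch does not model.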

Describe alternatives you've considered

  • Component: dialog module

  • Type: Script function

  • Name: fix_dialog_sockets([match_key])

  • Description: Introduce a new function exported by the dialog module that can be called inside the opensips.cfg file within the registration handling logic (after save()).
    Example usage:

        if (is_method("REGISTER")) {
            if (save("location")) {
                # If the client re-registered with a sip.instance,
                # fix any active dialog paths to use the new TCP/TLS socket
                if ($ct.fields(instance)) {
                    fix_dialog_sockets("$ct.fields(instance)");
                }
            }
        }

Additional context
