[Feature] Port SameAuthParamsLookupAutoClusterFailover from Java client to C++ #571

@graham-macdonald-simplisafe

Description

Motivation

The Java client introduced SameAuthParamsLookupAutoClusterFailover in apache/pulsar#23129 (merged August 2024, released in Pulsar 4.0.0 and backported to 3.0.7 / 3.3.2). This ServiceUrlProvider implementation addresses a well-known reliability gap in AutoClusterFailover that is particularly relevant to geo-replication deployments sitting behind a Pulsar Proxy.

The problem with AutoClusterFailover: its health probe is a raw TCP connection. In a typical deployment where a Pulsar Proxy fronts the brokers, the TCP probe succeeds as soon as the proxy accepts the connection — even if all brokers behind the proxy have crashed. This means AutoClusterFailover cannot detect broker-layer failure and may reconnect clients to a cluster that is not actually serving requests.

What SameAuthParamsLookupAutoClusterFailover does differently:

  • Probes cluster health via a topic lookup (getBroker() on a configurable test topic) rather than a raw TCP connection. A broker that can respond to a lookup is demonstrably processing requests — the proxy cannot mask broker failure here.
  • Introduces a hysteresis state machine with separate failoverThreshold and recoverThreshold counters (default 5 each), requiring consecutive failures before cutting over and consecutive successes before switching back. This prevents flapping without requiring a coarse switchBackDelay timer.
  • Targets geo-replication topologies where all clusters share the same authentication credentials, which is the common case.
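
To illustrate the hysteresis behaviour described above, here is a minimal single-cluster sketch in C++. It is not the Java implementation and not an existing C++ client API: the names `HysteresisTracker`, `ClusterState`, and `onProbe` are hypothetical, and the real class tracks several clusters and drives the probe itself. The sketch only shows how consecutive-failure and consecutive-success counters could gate the switch.

```cpp
// Hypothetical sketch: names and structure are illustrative assumptions,
// not part of the Pulsar C++ client or a port of the Java class.
enum class ClusterState { Healthy, Failed };

class HysteresisTracker {
   public:
    // Defaults mirror the thresholds quoted in this issue (5 each).
    explicit HysteresisTracker(int failoverThreshold = 5, int recoverThreshold = 5)
        : failoverThreshold_(failoverThreshold), recoverThreshold_(recoverThreshold) {}

    // Record one probe result (for example, the outcome of a topic lookup)
    // and return the state after applying it.
    ClusterState onProbe(bool success) {
        if (state_ == ClusterState::Healthy) {
            // While healthy, count consecutive failures; any success resets the count.
            consecutive_ = success ? 0 : consecutive_ + 1;
            if (consecutive_ >= failoverThreshold_) {
                state_ = ClusterState::Failed;
                consecutive_ = 0;
            }
        } else {
            // While failed, count consecutive successes; any failure resets the count.
            consecutive_ = success ? consecutive_ + 1 : 0;
            if (consecutive_ >= recoverThreshold_) {
                state_ = ClusterState::Healthy;
                consecutive_ = 0;
            }
        }
        return state_;
    }

    ClusterState state() const { return state_; }

   private:
    int failoverThreshold_;
    int recoverThreshold_;
    int consecutive_ = 0;
    ClusterState state_ = ClusterState::Healthy;
};
```

With `HysteresisTracker t(3, 2)`, two failed probes followed by a success leave the state at `Healthy`, because the success resets the failure count; only three failures in a row move it to `Failed`, and only two successes in a row move it back.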

Request

Port SameAuthParamsLookupAutoClusterFailover to the C++ client.

The ServiceInfoProvider interface is already part of the C++ public API (include/pulsar/ServiceInfoProvider.h), and AutoClusterFailover is already implemented against it — so the interface contract is defined and the pattern is established. The Java implementation (SameAuthParamsLookupAutoClusterFailover.java) serves as a direct reference.

Impact

The C++ client is the foundation for the Node.js client binding. Once SameAuthParamsLookupAutoClusterFailover is available in C++, it can be surfaced to Node.js users as well. The Node.js client currently has no automatic failover support at all.

This would bring C++ and Node.js deployments to parity with Java on the most important AutoClusterFailover reliability fix for proxy-fronted geo-replication clusters.
