ggivo
Collaborator

@ggivo ggivo commented Oct 1, 2025

This PR introduces ResilientRedisClient (renamed to MultiDbClient during review), a new high-availability Redis client that extends UnifiedJedis with automatic failover across multiple weighted endpoints.

Key Features

  • Multi-Endpoint Support: Configure multiple Redis endpoints with individual weights
  • Automatic Failover: Seamless switching to backup endpoints when the primary becomes unavailable
  • Circuit Breaker Pattern: Built-in circuit breaker with configurable thresholds
  • Health Monitoring: Continuous health checks with automatic failback to recovered endpoints
  • Dynamic Management: Add/remove endpoints at runtime without client restart
  • Event-Driven Monitoring: Listen to cluster switch events for alerting and observability
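
To illustrate how weighted endpoints, failover, and failback fit together, here is a minimal sketch. It assumes the client prefers the healthy endpoint with the highest weight; that selection rule is an assumption for illustration, and `WeightedFailoverSketch`, `Endpoint`, and `selectActive` are hypothetical names that are not part of the Jedis API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: none of these types belong to Jedis.
public class WeightedFailoverSketch {

    /** A candidate endpoint with a fixed weight and a mutable health flag. */
    static final class Endpoint {
        final String name;
        final float weight;
        volatile boolean healthy = true;

        Endpoint(String name, float weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    private final List<Endpoint> endpoints = new ArrayList<>();

    void add(Endpoint endpoint) {
        endpoints.add(endpoint);
    }

    /** Returns the healthy endpoint with the highest weight, or empty if none is healthy. */
    Optional<Endpoint> selectActive() {
        return endpoints.stream()
                .filter(e -> e.healthy)
                .max(Comparator.comparingDouble((Endpoint e) -> e.weight));
    }

    public static void main(String[] args) {
        WeightedFailoverSketch selector = new WeightedFailoverSketch();
        Endpoint east = new Endpoint("east", 100.0f);
        Endpoint west = new Endpoint("west", 50.0f);
        selector.add(east);
        selector.add(west);

        // Both healthy: the higher weight wins.
        System.out.println(selector.selectActive().map(e -> e.name).orElse("none")); // east

        // Failover: the primary is marked unhealthy.
        east.healthy = false;
        System.out.println(selector.selectActive().map(e -> e.name).orElse("none")); // west

        // Failback: the primary recovers.
        east.healthy = true;
        System.out.println(selector.selectActive().map(e -> e.name).orElse("none")); // east
    }
}
```

In this sketch the health flag is flipped by hand; in a real client it would be driven by health checks and circuit breaker state.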

Components Added

  • ResilientRedisClient - Main client class extending UnifiedJedis
  • ResilientClientBuilder - Fluent builder for client configuration

Usage Example

class MultiDbClientUsageExample {

    public static void main(String[] args) {
        HostAndPort east = new HostAndPort("localhost", 29379);
        RedisCredentials credentialsEast = new DefaultRedisCredentials("default", "secretEast");

        HostAndPort west = new HostAndPort("localhost", 29380);
        RedisCredentials credentialsWest = new DefaultRedisCredentials("default", "secretWest");


        MultiDbClient client = MultiDbClient.builder()
                .multiDbConfig(
                        MultiClusterClientConfig.builder()
                                .endpoint(
                                        ClusterConfig.builder(
                                                        east,
                                                        DefaultJedisClientConfig.builder().credentials(credentialsEast).build())
                                                .weight(100.0f)
                                                .build())
                                .endpoint(ClusterConfig.builder(
                                                west,
                                                DefaultJedisClientConfig.builder().credentials(credentialsWest).build())
                                        .weight(50.0f).build())
                                .circuitBreakerFailureRateThreshold(50.0f)
                                .retryMaxAttempts(3)
                                .build()
                )
                .databaseSwitchListener(event -> System.out.println("Switched to: " + event.getEndpoint()))
                .build();

        String infoEast = client.info("server");
        System.out.println("Before endpoint switch : Info East: " + infoEast);

        client.setActiveDatabaseEndpoint(west);

        String infoWest = client.info("server");
        System.out.println("After endpoint switch : Info West: " + infoWest);

        client.close();
    }

}
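
The example above sets `circuitBreakerFailureRateThreshold(50.0f)`. As a rough illustration of what a failure-rate threshold means, the sketch below trips once the percentage of failed calls reaches the threshold, but only after a minimum number of calls has been recorded. It is a simplified stand-in, not the actual implementation: `FailureRateBreakerSketch`, `minCalls`, and the cumulative (non-sliding) counters are assumptions made for illustration.

```java
// Hypothetical sketch: a cumulative failure-rate breaker, not the Jedis implementation.
public class FailureRateBreakerSketch {

    private final float thresholdPercent; // e.g. 50.0f means "open at 50% failures"
    private final int minCalls;           // do not evaluate the rate before this many calls
    private int calls;
    private int failures;

    FailureRateBreakerSketch(float thresholdPercent, int minCalls) {
        this.thresholdPercent = thresholdPercent;
        this.minCalls = minCalls;
    }

    /** Records the outcome of one call. */
    void record(boolean success) {
        calls++;
        if (!success) {
            failures++;
        }
    }

    /** Failure rate as a percentage of all recorded calls. */
    float failureRate() {
        return calls == 0 ? 0f : 100f * failures / calls;
    }

    /** Open once enough calls were seen and the failure rate reached the threshold. */
    boolean isOpen() {
        return calls >= minCalls && failureRate() >= thresholdPercent;
    }

    public static void main(String[] args) {
        FailureRateBreakerSketch breaker = new FailureRateBreakerSketch(50.0f, 4);

        breaker.record(true);
        breaker.record(false);
        System.out.println(breaker.isOpen()); // false: only 2 of the 4 required calls recorded

        breaker.record(false);
        breaker.record(true);
        System.out.println(breaker.isOpen()); // true: 2 failures out of 4 calls = 50%
    }
}
```

A production breaker would typically evaluate the rate over a sliding window and reset after a cool-down; both are omitted here to keep the threshold logic visible.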

Testing

  • Integration test

Based on PR : #4263.
Important changes in commit : feat: introduce ResilientRedisClient with multi-endpoint failover sup…

Closes #4299

@ggivo ggivo force-pushed the ggivo/add-builders-reselient-client branch from 688dd98 to 87bac3d Compare October 1, 2025 13:30

github-actions bot commented Oct 1, 2025

Test Results

   283 files  +1     283 suites  +1   11m 24s ⏱️ -1s
10 076 tests +8  10 031 ✅ +8  45 💤 ±0  0 ❌ ±0 
 2 714 runs  +8   2 714 ✅ +8   0 💤 ±0  0 ❌ ±0 

Results for commit f75c545. ± Comparison against base commit 11308f0.

♻️ This comment has been updated with latest results.

Contributor

@uglide uglide left a comment

We need to do a massive renaming to align the naming with the design:

  • ResilientClient -> MultiDbClient (builder, tests, etc)
  • multiClusterConfig -> multiDbConfig
  • ClusterConfig -> DatabaseConfig
  • clusterSwitchListener -> databaseSwitchListener
  • setActiveEndpoint -> setActiveDatabaseEndpoint

@ggivo ggivo changed the title [automatic failover] feat: Add ResilientRedisClient with multi-endpoint failover and circuit breaker support [automatic failover] feat: Add MultiDbClient with multi-endpoint failover and circuit breaker support Oct 2, 2025
@ggivo
Collaborator Author

ggivo commented Oct 2, 2025

> We need to do a massive renaming to align the naming with the design:
>
>   • ResilientClient -> MultiDbClient (builder, tests, etc)
>   • multiClusterConfig -> multiDbConfig
>   • ClusterConfig -> DatabaseConfig
>   • clusterSwitchListener -> databaseSwitchListener
>   • setActiveEndpoint -> setActiveDatabaseEndpoint

Some of the renaming will conflict with #4295.
I will do one round now (updating the newly introduced client/builder methods),

and another one after #4295 is merged for the remaining changes (ClusterConfig -> DatabaseConfig, ...).

@ggivo ggivo force-pushed the ggivo/add-builders-reselient-client branch 2 times, most recently from b5c96cb to 173bdfa Compare October 2, 2025 17:16
@ggivo ggivo changed the base branch from master to feature/automatic-failover-3 October 3, 2025 08:35
@ggivo ggivo force-pushed the feature/automatic-failover-3 branch from 8bbfd71 to e22cd06 Compare October 3, 2025 09:04
ggivo added 13 commits October 3, 2025 13:43
…port

Add ResilientRedisClient extending UnifiedJedis with automatic failover
capabilities across multiple weighted Redis endpoints. Includes circuit
breaker pattern, health monitoring, and configurable retry logic for
high-availability Redis deployments.
  - make sure endpoint is healthy before activating it
 - ResilientClient -> MultiDbClient (builder, tests, etc)
# Conflicts:
#	src/test/java/redis/clients/jedis/scenario/ActiveActiveFailoverTest.java
@ggivo ggivo force-pushed the ggivo/add-builders-reselient-client branch from 173bdfa to b48c3e4 Compare October 3, 2025 10:48
Contributor

@atakavci atakavci left a comment

LGTM!

@ggivo ggivo merged commit f69484a into feature/automatic-failover-3 Oct 3, 2025
12 checks passed
atakavci added a commit that referenced this pull request Oct 6, 2025
…4306)

* [automatic failover] Set and test default values for failover config&components (#4298)

* - set & test default values

* - format

* - fix tests failing due to changing defaults

* [automatic failover] Add dual thresholds (min num of failures + failure rate) capability to circuit breaker (#4295)

* [automatic failover] Remove the check for 'GenericObjectPool.getNumWaiters()' in 'TrackingConnectionPool' (#4270)

- remove the check for number of waiters in TrackingConnectionPool

* [automatic failover] Configure max total connections for EchoStrategy (#4268)

- set maxtotal connections for echoStrategy

* [automatic failover] Replace 'CircuitBreaker' with 'Cluster' for 'CircuitBreakerFailoverBase.clusterFailover' (#4275)

* - replace CircuitBreaker with Cluster for CircuitBreakerFailoverBase.clusterFailover
- improve thread safety with provider initialization

* - formatting

* [automatic failover] Minor optimizations on fast failover (#4277)

* - minor optimizations on fail fast

* -  volatile failfast

* [automatic failover] Implement health check retries (#4273)

* - replace minConsecutiveSuccessCount with numberOfRetries
- add retries into healthCheckImpl
- apply changes to strategy implementations config classes
- fix unit tests

* - fix typo

* - fix failing tests

* - add tests for retry logic

* - formatting

* - format

* - revisit numRetries for healthCheck, replace with numProbes and implement built-in policies
- new types probecontext, ProbePolicy, HealthProbeContext
- add delayer executor pool to healthCheckImpl
- adjustments on worker pool of healthCheckImpl for shared use of workers

* - format

* - expand comment with example case

* - drop pooled executor for delays

* - polish

* - fix tests

* - formatting

* - checking failing tests

* - fix test

* - fix flaky tests

* - fix flaky test

* - add tests for builtin probing policies

* - fix flaky test

* [automatic failover] Move failover provider to mcf (#4294)

* - move failover provider to mcf

* - make iterateActiveCluster package private

* [automatic failover]  Add SSL configuration support to LagAwareStrategy  (#4291)

* User-provided ssl config for lag-aware health check

* ssl scenario test for lag-aware healthcheck

* format

* format

* address review comments

  - use getters instead of fields

* [automatic failover] Implement max number of failover attempts (#4293)

* - implement max failover attempt
- add tests

* - fix: ensure the user receives the intended exception

* -clean+format

* - java doc for exceptions

* format

* - more tests on exception types in max failover attempts mechanism

* format

* fix failing timing in test

* disable health checks

* rename to switchToHealthyCluster

* format

* - Add dual-threshold (min failures + failure rate) failover to circuit breaker executor
- Map config to resilience4j via CircuitBreakerThresholdsAdapter
- clean up/simplify config: drop slow-call and window type
- Add thresholdMinNumOfFailures; update some of the defaults
- Update provider to use thresholds adapter
- Update docs; align examples with new defaults
- Add tests for 0% rate, edge thresholds

* polish

* Update src/main/java/redis/clients/jedis/mcf/CircuitBreakerThresholdsAdapter.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* - fix typo

* - fix min total calls calculation

* format

* - merge issues fixed

* fix javadoc ref

* - move threshold evaluations to failoverbase
- simplify executor and cbfailoverconnprovider
- adjust config getters
- fix failing tests due to COUNT_BASED -> TIME_BASED
- new tests for thresholds calculations and impact on circuit state transitions

* - avoid facilitating actual CBConfig type in tests

* Update src/test/java/redis/clients/jedis/failover/FailoverIntegrationTest.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Trigger workflows

* - evaluate only in failure recorded and failover immediately
- add more test on threshold calculations
- enable command line arg for overwriting surefire.excludedGroups

* format

* check pom

* - fix error prone test

* [automatic failover] Set and test default values for failover config&components (#4298)

* - set & test default values

* - format

* - fix tests failing due to changing defaults

* - fix flaky test

* - remove unnecessary checks for failover attempt

* - clean and trim adapter class
- add docs and more explanation

* fix javadoc issue

* - switch to all_success to fix flaky timing

* - fix issue in CircuitBreakerFailoverConnectionProvider

* introduce ReflectionTestUtil

---------

Co-authored-by: Ivo Gaydazhiev <ivo.gaydazhiev@redis.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* [automatic failover] feat: Add MultiDbClient with multi-endpoint failover and circuit breaker support (#4300)

* feat: introduce ResilientRedisClient with multi-endpoint failover support

Add ResilientRedisClient extending UnifiedJedis with automatic failover
capabilities across multiple weighted Redis endpoints. Includes circuit
breaker pattern, health monitoring, and configurable retry logic for
high-availability Redis deployments.

* format

* mark ResilientRedisClientTest as integration one

* fix test
  - make sure endpoint is healthy before activating it

* Rename ResilientClient to align with design

 - ResilientClient -> MultiDbClient (builder, tests, etc)

* Rename setActiveEndpoint to setActiveDatabaseEndpoint

* Rename clusterSwitchListener to databaseSwitchListener

* Rename multiClusterConfig to multiDbConfig

* fix api doc's error

* fix compilation error after rebase

* format

* fix example in javadoc

* Update ActiveActiveFailoverTest scenario test to use builders

# Conflicts:
#	src/test/java/redis/clients/jedis/scenario/ActiveActiveFailoverTest.java

* rename setActiveDatabaseEndpoint -> setActiveDatabase

* isHealthy throws an exception if the cluster does not exist

* format

* [automatic failover]Use Endpoint interface instead HostAndPort in multi db (#4302)

[clean up] Use Endpoint interface where possible

* - fix variable name type

* fix typo in variable name

* - fix flaky test

---------

Co-authored-by: Ivo Gaydazhiev <ivo.gaydazhiev@redis.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
ggivo added a commit that referenced this pull request Oct 6, 2025
…over and circuit breaker support (#4300)

* feat: introduce ResilientRedisClient with multi-endpoint failover support

Add ResilientRedisClient extending UnifiedJedis with automatic failover
capabilities across multiple weighted Redis endpoints. Includes circuit
breaker pattern, health monitoring, and configurable retry logic for
high-availability Redis deployments.

* format

* mark ResilientRedisClientTest as integration one

* fix test
  - make sure endpoint is healthy before activating it

* Rename ResilientClient to align with design

 - ResilientClient -> MultiDbClient (builder, tests, etc)

* Rename setActiveEndpoint to setActiveDatabaseEndpoint

* Rename clusterSwitchListener to databaseSwitchListener

* Rename multiClusterConfig to multiDbConfig

* fix api doc's error

* fix compilation error after rebase

* format

* fix example in javadoc

* Update ActiveActiveFailoverTest scenario test to use builders

# Conflicts:
#	src/test/java/redis/clients/jedis/scenario/ActiveActiveFailoverTest.java

* rename setActiveDatabaseEndpoint -> setActiveDatabase

* isHealthy throws an exception if the cluster does not exist

* format
Development

Successfully merging this pull request may close these issues.

[automatic failover] Add MultiDbClient with multi-endpoint failover and circuit breaker support
3 participants