[automatic failover] Automatic failover client improvements (part 3) #4306
…components (#4298) * - set & test default values * - format * - fix tests failing due to changing defaults
…ure rate) capabililty to circuit breaker (#4295) * [automatic failover] Remove the check for 'GenericObjectPool.getNumWaiters()' in 'TrackingConnectionPool' (#4270) - remove the check for number of waitiers in TrackingConnectionPool * [automatic failover] Configure max total connections for EchoStrategy (#4268) - set maxtotal connections for echoStrategy * [automatic failover] Replace 'CircuitBreaker' with 'Cluster' for 'CircuitBreakerFailoverBase.clusterFailover' (#4275) * - replace CircuitBreaker with Cluster for CircuitBreakerFailoverBase.clusterFailover - improve thread safety with provider initialization * - formatting * [automatic failover] Minor optimizations on fast failover (#4277) * - minor optimizations on fail fast * - volatile failfast * [automatic failover] Implement health check retries (#4273) * - replace minConsecutiveSuccessCount with numberOfRetries - add retries into healtCheckImpl - apply changes to strategy implementations config classes - fix unit tests * - fix typo * - fix failing tests * - add tests for retry logic * - formatting * - format * - revisit numRetries for healthCheck ,replace with numProbes and implement built in policies - new types probecontext, ProbePolicy, HealthProbeContext - add delayer executor pool to healthcheckımpl - adjustments on worker pool of healthCheckImpl for shared use of workers * - format * - expand comment with example case * - drop pooled executor for delays * - polish * - fix tests * - formatting * - checking failing tests * - fix test * - fix flaky tests * - fix flaky test * - add tests for builtin probing policies * - fix flaky test * [automatic failover] Move failover provider to mcf (#4294) * - move failover provider to mcf * - make iterateActiveCluster package private * [automatic failover] Add SSL configuration support to LagAwareStrategy (#4291) * User-provided ssl config for lag-aware health check * ssl scenario test for lag-aware healthcheck * format * format * address review comments - use 
getters instead of fields * [automatic failover] Implement max number of failover attempts (#4293) * - implement max failover attempt - add tests * - fix user receive the intended exception * -clean+format * - java doc for exceptions * format * - more tests on excaption types in max failover attempts mechanism * format * fix failing timing in test * disable health checks * rename to switchToHealthyCluster * format * - Add dual-threshold (min failures + failure rate) failover to circuit breaker executor - Map config to resilience4j via CircuitBreakerThresholdsAdapter - clean up/simplfy config: drop slow-call and window type - Add thresholdMinNumOfFailures; update some of the defaults - Update provider to use thresholds adapter - Update docs; align examples with new defaults - Add tests for 0% rate, edge thresholds * polish * Update src/main/java/redis/clients/jedis/mcf/CircuitBreakerThresholdsAdapter.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * - fix typo * - fix min total calls calculation * format * - merge issues fixed * fix javadoc ref * - move threshold evaluations to failoverbase - simplfy executer and cbfailoverconnprovider - adjust config getters - fix failing tests due to COUNT_BASED -> TIME_BASED - new tests for thresholds calculations and impact on circuit state transitions * - avoid facilitating actual CBConfig type in tests * Update src/test/java/redis/clients/jedis/failover/FailoverIntegrationTest.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Trigger workflows * - evaluate only in failure recorded and failover immediately - add more test on threshold calculations - enable command line arg for overwriting surefire.excludedGroups * format * check pom * - fix error prone test * [automatic failover] Set and test default values for failover config&components (#4298) * - set & test default values * - format * - fix tests failing due to changing defaults * - fix flaky test * - remove unnecessary 
checks for failover attempt * - clean and trim adapter class - add docs and more explanantion * fix javadoc issue * - switch to all_succes to fix flaky timing * - fix issue in CircuitBreakerFailoverConnectionProvider * introduce ReflectionTestUtil --------- Co-authored-by: Ivo Gaydazhiev <ivo.gaydazhiev@redis.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…over and circuit breaker support (#4300) * feat: introduce ResilientRedisClient with multi-endpoint failover support Add ResilientRedisClient extending UnifiedJedis with automatic failover capabilities across multiple weighted Redis endpoints. Includes circuit breaker pattern, health monitoring, and configurable retry logic for high-availability Redis deployments. * format * mark ResilientRedisClientTest as integration one * fix test - make sure endpoint is healthy before activating it * Rename ResilientClient to align with design - ResilientClient -> MultiDbClient (builder, tests, etc) * Rename setActiveEndpoint to setActiveDatabaseEndpoint * Rename clusterSwitchListener to databaseSwitchListener * Rename multiClusterConfig to multiDbConfig * fix api doc's error * fix compilation error after rebase * format * fix example in javadoc * Update ActiveActiveFailoverTest scenariou test to use builder's # Conflicts: # src/test/java/redis/clients/jedis/scenario/ActiveActiveFailoverTest.java * rename setActiveDatabaseEndpoint -. setActiveDatabase * is healthy throw exception if cluster does not exists * format
…ti db (#4302) [clean up] Use Endpoint interface where possible
Pull Request Overview
This PR introduces a dual-threshold circuit breaker mechanism and a new MultiDbClient for enhanced Redis multi-endpoint failover. The changes replace single-threshold circuit breaking with a system requiring both minimum failure count AND failure rate thresholds to be exceeded, preventing false positives in low-traffic scenarios.
- Implementation of dual-threshold circuit breaker system requiring both minimum failures and failure rate thresholds
- Introduction of MultiDbClient and MultiDbClientBuilder for simplified multi-endpoint Redis management
- Updated default configuration values for more production-ready behavior (conservative timeouts, intervals)
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
src/test/java/redis/clients/jedis/util/ReflectionTestUtil.java | Utility for accessing private fields in tests using reflection |
src/test/java/redis/clients/jedis/util/ClientTestUtil.java | Test utility for extracting connection providers from UnifiedJedis |
src/main/java/redis/clients/jedis/MultiDbClient.java | New high-availability Redis client with multi-endpoint support |
src/main/java/redis/clients/jedis/builders/MultiDbClientBuilder.java | Abstract builder for creating multi-database Redis clients |
src/main/java/redis/clients/jedis/mcf/CircuitBreakerThresholdsAdapter.java | Adapter to disable Resilience4j evaluation in favor of custom logic |
Multiple test files | Updated to use new dual-threshold configuration and MultiDbClient |
src/main/java/redis/clients/jedis/MultiClusterClientConfig.java | Major refactoring to support dual-threshold circuit breaker and new builder pattern |
AtomicReference<String> interruptedThreadName = new AtomicReference<>();
AtomicReference<Throwable> thrownException = new AtomicReference<>();
AtomicReference<Boolean> isInterrupted = new AtomicReference<>();
// When: Interrupt thse waiting thread
Corrected spelling of 'thse' to 'the'.
- // When: Interrupt thse waiting thread
+ // When: Interrupt the waiting thread
public ClusterSwitchEventArgs(SwitchReason reason, Endpoint endpoint, Cluster cluster) {
  this.reason = reason;
  // TODO: @ggivo do we need cluster name?
TODO comment should be resolved or converted to a proper issue tracker item before merging to main branch.
- // TODO: @ggivo do we need cluster name?
LGTM
LGTM
Circuit Breaker Two-Threshold Failover and MultiDbClient Introduction
This PR introduces a new dual-threshold circuit breaker mechanism and a simplified MultiDbClient API for Redis multi-endpoint deployments. The changes enhance failover precision by requiring both minimum failure count AND failure rate thresholds to be exceeded before triggering failover, preventing false positives from small sample sizes.
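The dual-threshold rule described above can be illustrated with a minimal, self-contained sketch. This is not the Jedis implementation: the class `DualThresholdSketch`, the method `shouldFailover`, and its parameter names are hypothetical, and treating both comparisons as inclusive (`>=`) is an assumption. It only shows why requiring a minimum failure count alongside a failure rate avoids tripping on a tiny sample.

```java
/** Illustrative sketch only; names and inclusive comparisons are assumptions, not the Jedis API. */
public final class DualThresholdSketch {

    private DualThresholdSketch() {}

    /**
     * Returns true only when BOTH thresholds are met:
     * the absolute failure count and the failure rate (in percent, 0..100).
     */
    static boolean shouldFailover(int failures, int successes,
                                  int minNumOfFailures, float failureRateThresholdPercent) {
        int total = failures + successes;
        if (total == 0) {
            return false; // no calls recorded, nothing to evaluate
        }
        if (failures < minNumOfFailures) {
            return false; // sample too small: rate alone must not trigger failover
        }
        float failureRatePercent = (failures * 100.0f) / total;
        return failureRatePercent >= failureRateThresholdPercent;
    }

    public static void main(String[] args) {
        // 2 failures out of 2 calls is a 100% rate, but below the minimum count of 5.
        System.out.println(shouldFailover(2, 0, 5, 50.0f));    // false
        // 10 failures meet the minimum count, but the rate is only 1%.
        System.out.println(shouldFailover(10, 990, 5, 50.0f)); // false
        // 10 failures out of 15 calls: both thresholds are exceeded.
        System.out.println(shouldFailover(10, 5, 5, 50.0f));   // true
    }
}
```

The first case is the low-traffic false positive that a rate-only threshold would produce; the second is the high-traffic case that a count-only threshold would produce.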
🚀 New Features Added
1. MultiDbClient - Simplified Multi-Endpoint Redis Client
- MultiDbClient class extending UnifiedJedis for high-availability Redis connectivity
- MultiDbClientBuilder abstract builder for creating multi-db Redis clients
- databaseSwitchListener
2. Dual-Threshold Circuit Breaker System
- circuitBreakerMinNumOfFailures configuration - Minimum number of failures required before circuit breaker can trip
- CircuitBreakerThresholdsAdapter - Disables Resilience4j's built-in evaluation to use Jedis custom logic
- evaluateThresholds() method in Cluster class for custom threshold evaluation
3. Enhanced Configuration Flexibility
- MultiClusterClientConfig.builder() - No-argument builder for dynamic endpoint addition
- endpoint(Endpoint, float, JedisClientConfig) - Simplified endpoint addition method
- endpoint(ClusterConfig) - Pre-configured cluster addition method
🔧 Core Improvements
1. Circuit Breaker Logic Overhaul
2. Configuration Defaults Optimization
3. API Modernization and Consistency
- ClusterConfig constructor now uses Endpoint interface instead of HostAndPort
- getHostAndPort() method renamed to getEndpoint() for consistency
- Endpoint interface usage throughout the codebase
4. Health Check System Enhancements
📦 Package and Structure Changes
1. Test Infrastructure Improvements
- CircuitBreakerThresholdsTest - Comprehensive tests for dual-threshold behavior
- ClusterEvaluateThresholdsTest - Unit tests for threshold evaluation logic
- DefaultValuesTest - Validation of all default configuration values
- MultiDbClientTest - Integration tests for the new MultiDbClient API
- ReflectionTestUtil - Utility class for test reflection operations
2. Maven Configuration Updates
- excludedGroupsForUnitTests property for flexible test group exclusion
🔄 API Changes
1. MultiClusterClientConfig Builder
2. MultiDbClient Usage
3. ClusterConfig Constructor Changes
🐛 Bug Fixes
1. Circuit Breaker State Management
2. Test Configuration Consistency
🎯 Behavioral Changes
1. Circuit Breaker Failover Logic
Before:
After:
2. Configuration Defaults
Before:
After:
3. API Usage Patterns
Before:
After:
📊 Configuration Examples
Basic Dual-Threshold Configuration
MultiDbClient with Event Handling
Dynamic Endpoint Management
API Changes
- ClusterConfig constructor: ClusterConfig(HostAndPort, JedisClientConfig) → ClusterConfig(Endpoint, JedisClientConfig)
- ClusterConfig getter method: getHostAndPort() → getEndpoint()
- Circuit breaker configuration: circuitBreakerMinNumOfFailures() method
Default Value Changes
Migration Guide
🧪 Testing
All changes validated with:
📝 Additional Notes