Fix a bug where all healthy endpoints are removed as `HealthCheckedEndpointGroup` is initialized #5343
…dpointGroup` is initialized

Motivation:

If new `Endpoint`s are updated in `HealthCheckedEndpointGroup`, `HealthCheckedEndpointGroup` creates a new `HealthCheckContextGroup`. When the new `HealthCheckContextGroup` is initialized, the old `HealthCheckContextGroup`s are removed for rolling updates. The removal logic is added as a callback to `contextGroup.whenInitialized()`.

https://github.com/line/armeria/blob/3f54be0ce4370b24977994e247fa3816fde25e29/core/src/main/java/com/linecorp/armeria/client/endpoint/healthcheck/HealthCheckedEndpointGroup.java#L175-L187

Context groups are stored in insertion order, so if the callback added first were executed first, the old groups would always be deleted and the latest group would be kept. However, `CompletableFuture` uses a stack structure for callbacks, so the last callback added is executed first when the future completes. If all contexts have the same endpoints, an old context group could remove a new context group. For example, the `D` context group removes the `A` to `C` context groups, and then the `C` context group removes groups until it finds itself. As a result, `contextGroupChain` becomes empty.

This situation is rare, but it can occur when:

- A `HealthCheckedEndpointGroup` is not initialized yet.
- The delegate is updated with new endpoints that have the same values as the previous endpoints but contain duplicates. The different number of endpoints creates a new `HealthCheckContextGroup` that shares all `HttpHealthChecker`s with the previous one.

Modifications:

- Do not try to remove old `HealthCheckContextGroup`s from a context group's callback if that context group has already been removed.
- Fix a bug where the reference count of `DefaultHealthCheckerContext` is incorrectly counted if there are duplicate endpoints.
- Use the endpoints selected by `HealthCheckStrategy` instead of the original endpoints updated by the delegate.

Result:

You no longer see `EndpointSelectionTimeoutException` when `HealthCheckedEndpointGroup` is initialized with duplicate `Endpoint`s.
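The callback ordering described in the motivation can be checked with a small standalone program. This is only a sketch: the class and method names are hypothetical, and the last-in-first-out order is observed behavior of the OpenJDK `CompletableFuture` implementation for callbacks registered before completion, not something the API documentation promises.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class; not part of Armeria.
class CallbackOrderDemo {
    // Registers three callbacks on an incomplete future, completes it,
    // and returns the order in which the callbacks actually ran.
    static List<String> runCallbacks() {
        final List<String> order = new ArrayList<>();
        final CompletableFuture<Void> future = new CompletableFuture<>();
        future.thenRun(() -> order.add("A")); // registered first
        future.thenRun(() -> order.add("B"));
        future.thenRun(() -> order.add("C")); // registered last
        // The callbacks run synchronously on the completing thread.
        future.complete(null);
        return order;
    }

    public static void main(String[] args) {
        // On OpenJDK, the callback registered last runs first.
        System.out.println(runCallbacks());
    }
}
```

Code that needs the oldest callback to run first therefore cannot rely on registration order alone.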
    if (!contextGroupChain.contains(contextGroup)) {
        // The contextGroup is already removed by another callback of `contextGroup.whenInitialized()`.
        return;
    }
This change prevents the old `contextGroup` from deleting new `contextGroup`s.
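A simplified model of the race may help show why the `contains` check matters. This is a sketch under stated assumptions rather than the Armeria implementation: plain strings stand in for `HealthCheckContextGroup` instances, a single shared future stands in for the initialization event that all groups observe, and `ContextGroupChainDemo` and `simulate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model; strings stand in for HealthCheckContextGroup instances.
class ContextGroupChainDemo {
    static List<String> simulate(boolean skipIfAlreadyRemoved) {
        // The chain keeps groups in insertion order: A is the oldest, D the newest.
        final List<String> chain = new ArrayList<>(Arrays.asList("A", "B", "C", "D"));
        // Assumption: every group is initialized by the same completion event.
        final CompletableFuture<Void> initialized = new CompletableFuture<>();
        for (String group : new ArrayList<>(chain)) {
            initialized.thenRun(() -> {
                if (skipIfAlreadyRemoved && !chain.contains(group)) {
                    return; // this group was already removed by a newer group's callback
                }
                // Remove older groups from the head until this group is found.
                final Iterator<String> it = chain.iterator();
                while (it.hasNext()) {
                    if (it.next().equals(group)) {
                        break;
                    }
                    it.remove();
                }
            });
        }
        // On OpenJDK the callbacks run in the order D, C, B, A.
        initialized.complete(null);
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("without check: " + simulate(false));
        System.out.println("with check:    " + simulate(true));
    }
}
```

Without the check, the callback for `D` leaves only `D` in the chain, and the callback for `C` then removes `D` because it never finds itself. With the check, the callbacks for `C`, `B`, and `A` return early and `D` survives.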
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

| Coverage Diff | main | #5343 | +/- |
|---|---|---|---|
| Coverage | 73.94% | 73.95% | |
| Complexity | 20104 | 20110 | +6 |
| Files | 1730 | 1730 | |
| Lines | 74161 | 74165 | +4 |
| Branches | 9465 | 9467 | +2 |
| Hits | 54841 | 54849 | +8 |
| Misses | 14844 | 14840 | -4 |
| Partials | 4476 | 4476 | |
> `CompletableFuture` uses a stack structure for callbacks, so the last callback will be executed first when it completes.

TIL! The changes make sense. Thanks @ikhoon 🙇 👍 🙇
Nice finding. Thanks, @ikhoon! 👍
Motivation:

If new `Endpoint`s are updated in `HealthCheckedEndpointGroup`, `HealthCheckedEndpointGroup` creates a new `HealthCheckContextGroup`. When the new `HealthCheckContextGroup` is initialized, the old `HealthCheckContextGroup` is removed for rolling updates.

The removal logic is added as a callback to `contextGroup.whenInitialized()`.

armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/healthcheck/HealthCheckedEndpointGroup.java
Lines 175 to 187 in 3f54be0

`CompletableFuture` uses a stack structure for callbacks, so the last callback will be executed first when it completes. If all contexts see the same future completion event, the old context group could remove the new context group. For example, the `D` context group removes the `A` to `C` context groups, and then the `C` context group removes groups until it finds itself. But there is no `C` in the context group chain. As a result, `C` removes `D`, the newest and last one.

This situation is rare, but it can occur when:

- `HealthCheckedEndpointGroup` is not initialized yet.
- The delegate is updated with new endpoints that have the same values as the previous endpoints but contain duplicates. The different number of endpoints creates a new `HealthCheckContextGroup` that shares all `HttpHealthChecker`s with the previous one.

Modifications:

- Do not try to remove old `HealthCheckContextGroup`s from a context group's callback if that context group has already been removed.
- Fix a bug where the reference count of `DefaultHealthCheckerContext` is incorrectly counted if there are duplicate endpoints.
- Use the endpoints selected by `HealthCheckStrategy` instead of the original endpoints updated by the delegate.

Result:

You no longer see `EndpointSelectionTimeoutException` when `HealthCheckedEndpointGroup` is initialized with duplicate `Endpoint`s.