New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xds: weightedTargetLB do a pick to get more failure details improving logs #7989
Conversation
@@ -131,20 +135,28 @@ private void updateOverallBalancingState() { | |||
List<WeightedChildPicker> childPickers = new ArrayList<>(); | |||
|
|||
ConnectivityState overallState = null; | |||
Status errorStatus = Status.UNAVAILABLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this Status.UNAVAILABLE
should never actually be used. Initialize the Status to null instead. It is normally wrong to create a "bare" status with just a code; almost universally it should have at least a description as well.
getStreamAggregatedResourcesMethod(), | ||
new Metadata(), | ||
CallOptions.DEFAULT.withOption(XdsNameResolver.CLUSTER_SELECTION_KEY, name)); | ||
errorStatus = childHelper.currentPicker.pickSubchannel(args).getStatus(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should save the status like this. It seems like it could easily be out-of-date, and we have to manufacture an RPC pick to trigger it which is asking for bugs, as higher-level policies wouldn't be able to modify this pick.
It would be much better to use the failing picker directly instead of ErrorPicker
. Although, I still think it'd be even better to use WeightedRandomPicker with all the failing transports. We could have a List<WeightedChildPicker> errorChildPickers
and use it to create WeightedRandomPicker if overallState == TRANSIENT_FAILURE
.
@@ -17,18 +17,22 @@ | |||
package io.grpc.xds; | |||
|
|||
import static com.google.common.base.Preconditions.checkNotNull; | |||
import static io.envoyproxy.envoy.service.discovery.v2.AggregatedDiscoveryServiceGrpc.getStreamAggregatedResourcesMethod; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this is introduced? This seems to be completely wrong...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i was leveraging any methodDescriptor
to do a dummy pick, not supposed to be useful, i know it was probably too error prone. ejona@ seems to have a better idea.
Fix #7952