
xds: reimplement EDS LB policy with downstream LB config generations that migrate to hierarchical LB tree codepath #7391

Merged

Conversation

@voidzcy (Contributor) commented Sep 3, 2020:

No description provided.

@voidzcy voidzcy force-pushed the impl/migrate_to_hierarchical_downstream_lb_tree branch from 2db7335 to ae61247 on September 3, 2020 19:01
@voidzcy voidzcy force-pushed the impl/migrate_to_hierarchical_downstream_lb_tree branch 2 times, most recently from a6b6206 to 1977109 on September 3, 2020 19:25
@voidzcy voidzcy marked this pull request as ready for review September 3, 2020 19:25
    @Nullable
    private final String edsServiceName;
    @Nullable
    private final String lrsServerName; // assume LRS server never change
Member:

The comment is not necessary, as with edsServiceName.

Contributor Author:

Deleted.

    if (lrsServerName != null) {
      loadStatsStore = xdsClient.addClientStats(cluster, edsServiceName);
      xdsClient.reportClientStats();
      attributes =
Member:

The attribute should be updated each time handleResolvedAddresses() is called.

Contributor Author:

Okay, I removed the separate caching of attributes. Instead, always populate from the freshest ResolvedAddresses.

    if (lb != null) {
      lb.handleNameResolutionError(error);
    } else {
      helper.updateBalancingState(TRANSIENT_FAILURE, new ErrorPicker(error));
Member:

This is the wrong helper. The right helper is the one passed in via the constructor ChildLbState(Helper helper), which GracefulSwitchLoadBalancer provides; that could be a pendingHelper that does not propagate TRANSIENT_FAILURE upstream. Even setting aside those technical details, the right helper should still be the wrapped helper in the immediately enclosing layer, allowing that layer to intercept the update.

Same for onResourceDoesNotExist() and onError().

Contributor Author:

Good catch. This is a subtle one. Fixed.

@voidzcy voidzcy force-pushed the impl/migrate_to_hierarchical_downstream_lb_tree branch from 380af00 to fe8372f on September 11, 2020 02:05
    @@ -105,7 +105,9 @@ public void handleResolvedAddresses(ResolvedAddresses resolvedAddresses) {
       if (cluster == null) {
         cluster = config.clusterName;
       }
    -  switchingLoadBalancer.switchTo(new EdsLbState(config.edsServiceName, config.lrsServerName));
    +  switchingLoadBalancer.switchTo(new EdsLbState(
    +      config.edsServiceName, config.lrsServerName, config.localityPickingPolicy,
Member:

This looks like more than a fix. You changed the definition of EdsLbState, why? But EdsLbState's equals() and hashCode() are not changed accordingly.

Contributor Author (Sep 14, 2020):

That's done on purpose. EdsLbState is a LoadBalancer.Factory used for switching when eds_service_name or lrs_server_name changes (aka, replacing the cluster). But we don't treat changing endpoint_picking_policy or locality_picking_policy as replacing the cluster, so they are not part of the equality of EdsLbState. These two fields are mutable and were moved into EdsLbState from ChildLbState to avoid a watcher reentrancy issue: ChildLbState subscribes to some endpoint resource, and the addWatcher operation can invoke the callback inline (reentrantly), so it would be too late to set up ChildLbState's endpoint_picking_policy and locality_picking_policy in its handleResolvedAddresses.
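
For illustration, a minimal sketch of the equality semantics described here. The field set and class shape are assumptions pieced together from this thread, not the actual grpc-java code; it assumes GracefulSwitchLoadBalancer deduplicates factories via equals(), as this discussion implies.

    import java.util.Objects;

    // Sketch only: equality covers exactly the fields whose change means
    // "replacing the cluster", so a graceful switch happens only for those.
    final class EdsLbStateSketch {
      private final String edsServiceName;  // identity: changing it replaces the cluster
      private final String lrsServerName;   // identity: changing it replaces the cluster
      // locality_picking_policy / endpoint_picking_policy are mutable and deliberately
      // excluded from equals()/hashCode(): changing them updates the running child
      // in place rather than triggering a graceful switch.

      EdsLbStateSketch(String edsServiceName, String lrsServerName) {
        this.edsServiceName = edsServiceName;
        this.lrsServerName = lrsServerName;
      }

      @Override
      public boolean equals(Object o) {
        if (!(o instanceof EdsLbStateSketch)) {
          return false;
        }
        EdsLbStateSketch that = (EdsLbStateSketch) o;
        return Objects.equals(edsServiceName, that.edsServiceName)
            && Objects.equals(lrsServerName, that.lrsServerName);
      }

      @Override
      public int hashCode() {
        return Objects.hash(edsServiceName, lrsServerName);
      }
    }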

Member:

I'm not so comfortable having these mutable fields in a Factory; it looks inelegant. Is it possible to move xdsClient.watchEndpointData() into handleResolvedAddresses() to avoid the reentrancy issue?

Contributor Author:

I tried a couple of different approaches; none of them seems satisfactory. handleResolvedAddresses is called multiple times while xdsClient.watchEndpointData() is only called once, so moving it into handleResolvedAddresses is cumbersome. From another perspective, EdsLbState is not just a LoadBalancer.Factory: it implements LoadBalancer.Factory because we have to (due to the usage of the GracefulSwitchLoadBalancer API), but it also represents the state of EdsLoadBalancer, a running instance for one EdsConfig (eds_service_name, lrs_server_name, locality_picking_policy, endpoint_picking_policy). From this perspective, it's not that weird to have endpoint_picking_policy and locality_picking_policy saved there.

Contributor Author:

Moved endpoint_picking_policy and locality_picking_policy into ChildLbState. Delaying the execution of xdsClient.watchEndpointData() until after ChildLbState's constructor to avoid reentrancy may be problematic, as ChildLbState.shutdown() (which cancels the watcher) may happen before the current task in the syncContext returns. So it'd be better to keep watcher addition/cancellation in place. Instead, we can just have the onEndpointChanged callback not invoke the child policy's handleResolvedAddresses() if such a reentrancy happens. After the callback returns, the following invocation of ChildLbState's handleResolvedAddresses() will have the endpoint_picking_policy and locality_picking_policy and will invoke the child policy's handleResolvedAddresses(). A sketch of this guard follows.
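
A minimal sketch of that reentrancy guard; every type and method name below is a hypothetical stand-in for the grpc-java internals under discussion, trimmed down to show the guard itself.

    // Sketch only: FakeXdsClient models a client that may deliver cached data inline.
    interface EndpointWatcher {
      void onEndpointChanged(String update);
    }

    final class FakeXdsClient {
      void watchEndpointData(String resource, EndpointWatcher watcher) {
        // The real client may invoke the callback inline (reentrantly),
        // while the caller is still inside ChildLbState's constructor.
        watcher.onEndpointChanged("cached-endpoints-for-" + resource);
      }
    }

    final class ChildLbState implements EndpointWatcher {
      private String latestUpdate;  // latest endpoint data from the watcher
      private String childConfig;   // set by handleResolvedAddresses()

      ChildLbState(FakeXdsClient xdsClient, String resource) {
        // Watcher addition stays in the constructor; the callback may fire inline.
        xdsClient.watchEndpointData(resource, this);
      }

      void handleResolvedAddresses(String config) {
        this.childConfig = config;
        if (latestUpdate != null) {
          propagateToChildPolicy();  // endpoint data arrived reentrantly; deliver it now
        }
      }

      @Override
      public void onEndpointChanged(String update) {
        this.latestUpdate = update;
        if (childConfig == null) {
          return;  // reentrant call from the constructor; the config isn't ready yet
        }
        propagateToChildPolicy();
      }

      private void propagateToChildPolicy() {
        System.out.println("child policy sees " + latestUpdate + " with " + childConfig);
      }
    }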

@voidzcy voidzcy added the kokoro:force-run label Sep 16, 2020
@grpc-kokoro grpc-kokoro removed the kokoro:force-run label Sep 16, 2020
@voidzcy voidzcy closed this Sep 16, 2020
@voidzcy voidzcy reopened this Sep 16, 2020
Comment on lines +165 to +166
    loadStatsStore = xdsClient.addClientStats(cluster, edsServiceName);
    xdsClient.reportClientStats();
Member:

Does the order matter?

Contributor Author:

Doesn't really matter. addClientStats creates the stats object for recording loads, while reportClientStats creates the LRS RPC stream and starts load reporting. Since LRS response handling is always serialized in the sync context, at the time the next load reporting request is sent it always sees the newly created stats object for this cluster. So the order of these two lines doesn't affect anything.

Comment on lines +275 to +294
    PriorityLbConfig config = generatePriorityLbConfig(cluster, edsServiceName,
        lrsServerName, localityPickingPolicy, endpointPickingPolicy, lbRegistry,
        prioritizedLocalityWeights);
    // TODO(chengyuanzhang): to be deleted after migrating to use XdsClient API.
    Attributes attributes;
    if (lrsServerName != null) {
      attributes =
          resolvedAddresses.getAttributes().toBuilder()
              .set(XdsAttributes.ATTR_CLUSTER_SERVICE_LOAD_STATS_STORE, loadStatsStore)
              .build();
    } else {
      attributes = resolvedAddresses.getAttributes();
    }
    lb.handleResolvedAddresses(
        resolvedAddresses.toBuilder()
            .setAddresses(endpointAddresses)
            .setAttributes(attributes)
            .setLoadBalancingPolicyConfig(config)
            .build());
  }
Member:

Is this duplicate code?

Contributor Author:

Kind of, yes. The lines putting LoadStatsStore into the attributes are temporary and will be deleted once XdsClient has an API for directly accessing the per-locality stats object. Also, there are really only a few statements here; they just involve the verbose builder pattern.

However, I also noticed that it would be better to make the latest ResolvedAddresses a field in ChildLbState: each ChildLbState instance operates according to a specific address update, so it should be bound to that ResolvedAddresses.
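
A small sketch of that binding, with simplified names (ResolvedAddresses is grpc-java's io.grpc.LoadBalancer.ResolvedAddresses; the class shape is illustrative, not the actual code):

    import io.grpc.LoadBalancer.ResolvedAddresses;

    // Illustrative shape only. Each ChildLbState keeps the ResolvedAddresses it was
    // created to serve, so later watcher callbacks can rebuild the child config
    // without re-plumbing the attributes separately.
    final class ChildLbStateSketch {
      private ResolvedAddresses resolvedAddresses;  // the update this instance operates on

      void handleResolvedAddresses(ResolvedAddresses resolvedAddresses) {
        this.resolvedAddresses = resolvedAddresses;
        // ... regenerate the downstream config from this.resolvedAddresses ...
      }
    }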

    if (lb != null) {
      lb.handleNameResolutionError(error);
    } else {
      lbHelper.updateBalancingState(TRANSIENT_FAILURE, new ErrorPicker(error));
Member:

It shouldn't be this DropHandlingLbHelper lbHelper either; that's for the lb with dropping logic, which is too far downstream. The right one should be the helper in ChildLbState(Helper helper).

Contributor Author:

Yeah, I know. It doesn't matter though: DropHandlingLbHelper is just the helper received in ChildLbState(Helper helper) with a thin layer wrapped around it to record drops, which doesn't affect its error-propagation behavior.

Contributor Author:

From the API's perspective, your thought is legit. What about changing it to lbHelper.delegate().updateBalancingState(...) or lbHelper.helper.updateBalancingState(...)?
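
For context, a rough sketch of what such a thin drop-recording wrapper might look like, assuming grpc-java's ForwardingLoadBalancerHelper. The actual DropHandlingLbHelper internals are not shown in this thread, so the body below is hypothetical.

    import io.grpc.ConnectivityState;
    import io.grpc.LoadBalancer.Helper;
    import io.grpc.LoadBalancer.SubchannelPicker;
    import io.grpc.util.ForwardingLoadBalancerHelper;

    // Hypothetical sketch: a thin wrapper over the helper from ChildLbState(Helper helper)
    // that only intercepts pickers to record drops; everything else forwards unchanged.
    final class DropHandlingLbHelper extends ForwardingLoadBalancerHelper {
      private final Helper delegate;

      DropHandlingLbHelper(Helper delegate) {
        this.delegate = delegate;
      }

      @Override
      protected Helper delegate() {
        return delegate;  // lbHelper.delegate() unwraps back to the ChildLbState helper
      }

      @Override
      public void updateBalancingState(ConnectivityState state, SubchannelPicker picker) {
        delegate.updateBalancingState(state, wrapWithDropRecording(picker));
      }

      private SubchannelPicker wrapWithDropRecording(SubchannelPicker picker) {
        return picker;  // placeholder: real code would record drops before delegating picks
      }
    }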


    @Override
    public Subchannel createSubchannel(CreateSubchannelArgs args) {
      return new FakeSubchannel(args.getAddresses());
Member:

This can just return mock(Subchannel.class); the test does not depend on anything from Subchannel.

Contributor Author:

Yeah, right. Fixed.

Comment on lines 536 to 542
    LoadBalancerRegistry lbRegistry = LoadBalancerRegistry.getDefaultRegistry();
    lbRegistry.deregister(lbRegistry.getProvider(fakeLeafPolicyProvider.getPolicyName()));
    lbRegistry.register(fakeLeafPolicyProvider);
    loadBalancer.shutdown();
    loadBalancer = new EdsLoadBalancer2(helper, lbRegistry, mockRandom);
    PolicySelection weightedTargetSelection =
        new PolicySelection(lbRegistry.getProvider(WEIGHTED_TARGET_POLICY_NAME), null, null);
Member:

An alternative is to deregister all the fake providers in setUp() and re-register those policies with real providers. The benefit is that you don't need to shut down the balancer and create a new one:

    registry.deregister(registry.getProvider(PRIORITY_POLICY_NAME));
    registry.register(new PriorityLoadBalancerProvider());
    registry.deregister(registry.getProvider(LRS_POLICY_NAME));
    registry.register(new LrsLoadBalancerProvider());
    registry.register(fakeLeafPolicyProvider);
    PolicySelection weightedTargetSelection =
        new PolicySelection(new WeightedTargetLoadBalancerProvider(), null, null);

Member:

If using the real registry, you need to clean up the real registry at the end of the test.

Contributor Author:

Fixed to reuse the existing registry with replaced LB providers.

Member:

Hmm... It turns out even without

    lbRegistry.register(fakeLeafPolicyProvider);

the test could still pass.

Contributor Author (Sep 17, 2020):

Because the leaf policy provider (aka endpoint_picking_policy) is not obtained from the registry; it is obtained from the LB config. So the one in the registry is not used.
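
To make that concrete, a fragment modeled on the PolicySelection usage shown earlier in this thread (fakeLeafPolicyProvider is the test's own provider):

    // The provider instance travels inside the config handed to the EDS policy,
    // so the LoadBalancerRegistry is never consulted for the leaf policy by name.
    PolicySelection endpointPickingPolicy =
        new PolicySelection(fakeLeafPolicyProvider, null, null);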

@dapengzhang0 (Member) left a comment:

LGTM

@voidzcy (Contributor Author) commented Sep 17, 2020:

I am gonna test it before merging.

@voidzcy (Contributor Author) commented Sep 18, 2020:

The EDS policy is responsible for sieving priorities/localities down to the usable ones.

I added some missing special-case handling (see the sketch below):

  • Localities with no healthy endpoints should be thrown away and not propagated to downstream policies.
  • Priorities with no usable localities should be thrown away and not propagated to downstream policies.
  • The policy should turn to TRANSIENT_FAILURE if no usable priority is found.

PTAL.
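
A minimal sketch of that sieving logic, using hypothetical simplified types; the real code operates on the xDS endpoint data, not these classes.

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical simplified locality model: just per-endpoint health flags.
    final class Locality {
      final List<Boolean> endpointHealth;  // true = healthy endpoint

      Locality(List<Boolean> endpointHealth) {
        this.endpointHealth = endpointHealth;
      }

      boolean hasHealthyEndpoints() {
        return endpointHealth.contains(true);
      }
    }

    final class PrioritySieve {
      /** Returns the usable priorities; an empty result means turning to TRANSIENT_FAILURE. */
      static Map<Integer, List<Locality>> sieve(Map<Integer, List<Locality>> priorities) {
        Map<Integer, List<Locality>> usable = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Locality>> entry : priorities.entrySet()) {
          List<Locality> kept = new ArrayList<>();
          for (Locality locality : entry.getValue()) {
            if (locality.hasHealthyEndpoints()) {  // drop localities with no healthy endpoints
              kept.add(locality);
            }
          }
          if (!kept.isEmpty()) {  // drop priorities with no usable localities
            usable.put(entry.getKey(), kept);
          }
        }
        return usable;  // empty => the policy turns to TRANSIENT_FAILURE
      }
    }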

@dapengzhang0 (Member):
LGTM

@voidzcy (Contributor Author) commented Sep 18, 2020:

The header matching interop test fails consistently as the gRPC client keeps receiving empty EDS responses.

  • Initially the client receives valid LDS/RDS/CDS responses, but empty EDS responses. After a while, it receives empty CDS responses, and later all LDS/RDS/CDS/EDS responses become empty.

This happens at the initialization stage (aka, setting up the two backend services and waiting for endpoints in both backend services to receive RPCs), and the test driver hasn't even applied any header-matching-specific configurations to the URL map. Interestingly, tests with the same initialization steps (e.g., the path matching test) run successfully.

This shouldn't be related to anything done in this change (or even gRPC's implementation). I am merging this and moving investigations to the next PR, which tries to promote the changes in this PR.

@voidzcy voidzcy merged commit e6b61ea into grpc:master Sep 18, 2020
dfawley pushed a commit to dfawley/grpc-java that referenced this pull request Jan 15, 2021
…that migrate to hierarchical LB tree codepath (grpc#7391)

Implemented the new EDS LB policy, which generates an LB config for instantiating a hierarchical load balancing subtree. The subtree includes downstream LB policies:

- priority LB policy, which load balances individual priorities separately
- weighted-target LB policy, which load balances individual localities within the same priority separately
- lrs LB policy, which applies load recording that aggregates load stats for each locality
- leaf LB policy (round_robin)

The EDS LB policy is the place that receives endpoint information from traffic director and organizes the data into a model for hierarchical load balancing for the cluster.
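
As a rough illustration of the hierarchy described above, here is a simplified rendering of the generated LB subtree as raw maps. The structure and field names are illustrative, and the real implementation builds typed configs (e.g. PriorityLbConfig) rather than this literal shape; the locality names and weights are invented for the example.

    import java.util.List;
    import java.util.Map;

    // Illustrative only: priority -> weighted_target -> lrs -> round_robin nesting.
    final class LbTreeSketch {
      public static void main(String[] args) {
        Map<String, Object> leaf = Map.of("round_robin", Map.of());
        Map<String, Object> lrs = Map.of(                 // records per-locality load stats
            "lrs", Map.of("child_policy", List.of(leaf)));
        Map<String, Object> localityA = Map.of("weight", 70, "child_policy", List.of(lrs));
        Map<String, Object> localityB = Map.of("weight", 30, "child_policy", List.of(lrs));
        Map<String, Object> weightedTarget = Map.of(      // localities within one priority
            "weighted_target", Map.of("targets",
                Map.of("locality-a", localityA, "locality-b", localityB)));
        Map<String, Object> priority = Map.of(            // priorities balanced separately
            "priority", Map.of("children", Map.of("priority-0", weightedTarget)));
        System.out.println(priority);                     // the subtree the EDS policy emits
      }
    }
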
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 7, 2021