Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support STRICT_DNS, HealthCheck for XdsEndpointGroup (xDS-endpoint pt 3) #5503

Merged
merged 7 commits into from Apr 2, 2024

Conversation

jrhee17
Copy link
Contributor

@jrhee17 jrhee17 commented Mar 13, 2024

This PR attempts to support DNS, Health Checked clusters. This is done by converting ClusterSnapshot to an EndpointGroup via XdsEndpointUtil#convertEndpoints. The rest of the changes are simply refactoring.

Motivation:

There are three major points of change:

  1. XdsEndpointUtil#convertEndpointGroup has been introduced which converts a ClusterSnapshot into an EndpointGroup.

This method essentially converts a ClusterSnapshot into a DnsAddressEndpointGroup or a HealthCheckedEndpointGroup. Note that only basic functionality is supported at this stage.

During this implementation, I found it convenient to refer to a LbEndpoint or LocalityLbEndpoints for a particular Endpoint. For this reason, XdsAttributeKeys, XdsAttributeAssigningEndpointGroup has been introduced so that xDS resource information can be retrieved easily. This is useful later as well, in order to fetch the locality, priority, health, etc. of a particular endpoint.

  1. New data structures have been introduced to match that of envoy's.
    Some include:

The nuances are slightly different due to a difference in existing APIs, difference in threading model and lifecycles. However, an effort has been made to keep the implementation relatively similarly looking.

Note that the logic inside SubsetLoadBalancer is changed minimally to keep the changeset small.

  1. The way updates are propagated has changed for XdsEndpointGroup.

ClusterManager now receives updates on the ListenerSnapshot by listening to the provided ListenerRoot. On each snapshot update, a ClusterEntry is created for each ClusterSnapshot. ClusterEntry constructs the EndpointGroup based on the provided ClusterSnapshot and endpoint updates (dns/health checked EndpointGroups may update multiple times).

Whenever 1) A ClusterEntry#endpointGroup's endpoints are updated or 2) ListenerSnapshot is updated, then ClusterManager#notifyListeners calls XdsEndpointGroup#setEndpoints. This will eventually trigger XdsEndpointSelector#selectNow. This will subsequently trigger ClusterManager#selectNow and LoadBalancer#selectNow.

Miscellaneous changes

  • XdsEndpointGroup implements EndpointGroup directly since custom comparison logic is required for XdsEndpointGroup#updateState
  • I think we will eventually need to turn Endpoint into an interface, and supported various types of endpoints (e.g. HealthCheckedEndpoint, XdsEndpoint, etc..). For now, I propose to use attributes for simplicity.
  • Because ClusterManager#notifyListeners is called for any update (i.e. update to the snapshot, DnsAddressEndpointGroup updates, etc..) there is a good chance that empty endpoints will be set. This isn't ideal based on the normal use case where users usually wait for initial endpoints via EndpointGroup#whenComplete. I've set allowEmptyEndpoints = false to be the default.

POC: #5450

Modifications:

  • Introduce XdsEndpointUtil#convertEndpointGroup, which creates an EndpointGroup based on the ClusterSnapshot. DNS, HealthCheck capabilities are partially supported.
  • LoadBalancer, ClusterManager, ClusterEntry, PrioritySet, SubsetLoadBalancer are newly introduced.
  • ClusterManager listens for any updates regarding Endpoints or Snapshots and notifies EndpointGroup if there are any changes. Every time this occurs, LoadBalancer#selectNow is called.

Result:

  • XdsEndpointGroup now partially supports DNS and health check.

@jrhee17 jrhee17 changed the title Support STRICT_DNS, HealthCheck for XdsEndpointGroup ((xDS-endpoint pt 3) Support STRICT_DNS, HealthCheck for XdsEndpointGroup (xDS-endpoint pt 3) Mar 14, 2024
@jrhee17 jrhee17 force-pushed the feature/xds-endpoints-3 branch 4 times, most recently from 07f880a to 874016a Compare March 14, 2024 14:14
@jrhee17 jrhee17 added this to the 1.28.0 milestone Mar 14, 2024
@jrhee17 jrhee17 marked this pull request as ready for review March 14, 2024 15:11
@jrhee17
Copy link
Contributor Author

jrhee17 commented Mar 25, 2024

Ready for review 😄

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Left minor comments.

endpoint = Endpoint.of(hostname)
.withIpAddr(socketAddress.getAddress())
.withAttr(XdsAttributesKeys.LB_ENDPOINT_KEY, lbEndpoint)
.withAttr(XdsAttributesKeys.LOCALITY_LB_ENDPOINTS_KEY, localityLbEndpoints);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q) Should we also set weight for this Endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦


static int endpointWeight(LbEndpoint lbEndpoint) {
return lbEndpoint.hasLoadBalancingWeight() ?
Math.max(1, lbEndpoint.getLoadBalancingWeight().getValue()) : 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this.endpoints = ImmutableList.copyOf(endpoints);
}

ClusterLoadAssignment clusterLoadAssignment() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is unused. Did you add it for the future usage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I eventually will, but I'm committed to splitting PRs so let me remove it 😄

Copy link

codecov bot commented Mar 26, 2024

Codecov Report

Attention: Patch coverage is 71.92982% with 96 lines in your changes are missing coverage. Please review.

Project coverage is 74.07%. Comparing base (7f30250) to head (cdb5d24).
Report is 6 commits behind head on main.

❗ Current head cdb5d24 differs from pull request most recent head a504ec9. Consider uploading reports for the commit a504ec9 to get more accurate results

Files Patch % Lines
...rp/armeria/xds/client/endpoint/ClusterManager.java 68.42% 25 Missing and 11 partials ⚠️
...p/armeria/xds/client/endpoint/XdsEndpointUtil.java 61.11% 20 Missing and 15 partials ⚠️
.../armeria/xds/client/endpoint/XdsEndpointGroup.java 79.54% 5 Missing and 4 partials ⚠️
...rmeria/xds/client/endpoint/SubsetLoadBalancer.java 70.37% 4 Missing and 4 partials ⚠️
...corp/armeria/xds/client/endpoint/MetadataUtil.java 76.19% 3 Missing and 2 partials ⚠️
...corp/armeria/xds/client/endpoint/ClusterEntry.java 84.21% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5503      +/-   ##
============================================
+ Coverage     74.05%   74.07%   +0.01%     
- Complexity    20854    20923      +69     
============================================
  Files          1807     1815       +8     
  Lines         76757    77010     +253     
  Branches       9792     9823      +31     
============================================
+ Hits          56842    57044     +202     
- Misses        15292    15319      +27     
- Partials       4623     4647      +24     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! 🚀🚀

Comment on lines 71 to 78
endpointGroup.removeListener(this);
endpointGroup.removeListener(clusterManager);
return endpointGroup.closeAsync();
}

@Override
public void close() {
endpointGroup.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q) Is endpointGroup.removeListener(...) unnecessary for close()?

private State state() {
final Map<ClusterSnapshot, ClusterEntry> clusterEntries = this.clusterEntries;
if (clusterEntries.isEmpty()) {
return new State(listenerSnapshot, Collections.emptyList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid additional object allocations. EmptyList in JDK creates an empty array when .toArray() is called. In addition, the call path inside ImmutableList.copyOf() is shorter.

Suggested change
return new State(listenerSnapshot, Collections.emptyList());
return new State(listenerSnapshot, ImmutableList.of());

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left only minor suggestion.
Looks great!

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 💯 💯 💯

…sterManager.java

Co-authored-by: minux <songmw725@gmail.com>
@jrhee17 jrhee17 merged commit 2cc50a1 into line:main Apr 2, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants