xds: skip DiscoveryRequest for unsubscribed types on stream ready #12782
Merged
kannanjgithub merged 1 commit on May 6, 2026
When the bootstrap declares more than one xDS server (xDS Federation, A47), ControlPlaneClient.adjustResourceSubscription emitted a DiscoveryRequest for every globally-subscribed resource type when the ADS stream became ready -- including types that have no subscription on this server. Authority-specific servers (e.g. an EDS-only control plane) reply UNIMPLEMENTED to types they do not handle, which tears down the stream before the legitimate request can complete.

Skip the request when both:

1. getSubscribedResources returns null. Per the ResourceStore contract, null means "no subscription"; an empty collection means wildcard subscription and must still go on the wire so the watcher's missing-resource timers start.
2. No DiscoveryRequest of this type has been sent on the current stream, tracked in a per-stream sentTypes set on AdsStream. Per-stream is the right scope: the server's view of our subscriptions is per-stream, and the set clears implicitly when the stream is replaced. Gating on versions.containsKey(type) would be incorrect because versions is only populated on ACK -- a watch canceled after the initial DiscoveryRequest but before any ACK would have its empty unsubscribe suppressed, leaving the server with a stale subscription until the stream resets.

Mirrors grpc-go's adsStreamImpl.sendExisting, which skips types where len(state.subscribedResources) == 0 on stream re-establishment.
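The two conditions can be read as a single predicate. The following is a minimal sketch under stated assumptions, not the actual `ControlPlaneClient` code: the class `SkipGuardSketch` and the method `shouldSkip` are hypothetical names, and the per-stream `sentTypes` set is modeled as a plain `Set<String>` of type URLs.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the guard; not the real grpc-java implementation. */
final class SkipGuardSketch {
  /**
   * Returns true when no DiscoveryRequest should be emitted for typeUrl.
   *
   * @param resources null = no subscription, empty = wildcard, otherwise named resources
   * @param sentTypes type URLs already sent on the current stream (assumed representation)
   */
  static boolean shouldSkip(Collection<String> resources, Set<String> sentTypes, String typeUrl) {
    return resources == null && !sentTypes.contains(typeUrl);
  }

  public static void main(String[] args) {
    String cds = "type.googleapis.com/envoy.config.cluster.v3.Cluster";
    Set<String> sentTypes = new HashSet<>();

    // Fresh stream, no subscription on this server: skip.
    System.out.println(shouldSkip(null, sentTypes, cds)); // true
    // Fresh stream, wildcard subscription (empty collection): must still be sent.
    System.out.println(shouldSkip(Collections.emptyList(), sentTypes, cds)); // false
    // A request of this type already went out: the empty unsubscribe must be sent.
    sentTypes.add(cds);
    System.out.println(shouldSkip(null, sentTypes, cds)); // false
  }
}
```

The third case is the reason the predicate needs the second operand: without it, a type whose subscription was dropped at runtime would be indistinguishable from a type that was never subscribed on this stream.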
Pull request overview
Fixes xDS federation behavior in ControlPlaneClient so that when an ADS stream becomes ready, grpc-java does not send DiscoveryRequests for resource types that have no subscription on that specific server, avoiding UNIMPLEMENTED errors from authority-specific (e.g., EDS-only) control planes that would otherwise tear down the stream.
Changes:
- Add a per-ADS-stream `sentTypes` tracker and use it to skip sending initial empty `DiscoveryRequest`s for unsubscribed types on stream ready.
- Preserve correct unsubscribe behavior by still sending empty requests when a type was previously sent on the stream (including the cancel-before-ACK window).
- Add unit tests covering federation, wildcard subscription, and unsubscribe timing behavior.
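The cancel-before-ACK window mentioned in the second bullet can be illustrated with a toy model of a single stream. This is a sketch under assumptions, not grpc-java code: `CancelBeforeAckSketch`, `Gate`, `adjust`, `onAck`, and the short `"EDS"` label are all hypothetical, and the "wire" is just a list of strings. It compares a guard keyed on ACKed versions with one keyed on what was actually sent.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of one ADS stream; every name here is illustrative. */
final class CancelBeforeAckSketch {
  enum Gate { VERSIONS, SENT_TYPES }

  private final Gate gate;
  private final Map<String, String> versions = new HashMap<>(); // filled only on ACK
  private final Set<String> sentTypes = new HashSet<>();        // filled on every send
  final List<String> wire = new ArrayList<>();                  // requests the server sees

  CancelBeforeAckSketch(Gate gate) {
    this.gate = gate;
  }

  /** Records an ACKed version; never called in run(), which models "no ACK yet". */
  void onAck(String type, String version) {
    versions.put(type, version);
  }

  /** resources == null models "no subscription" for this type on this server. */
  void adjust(String type, Collection<String> resources) {
    boolean seenBefore =
        gate == Gate.VERSIONS ? versions.containsKey(type) : sentTypes.contains(type);
    if (resources == null && !seenBefore) {
      return; // guard fires: nothing goes on the wire
    }
    Collection<String> names =
        resources == null ? Collections.<String>emptyList() : resources;
    sentTypes.add(type);
    wire.add(type + names);
  }

  /** Subscribe, then cancel the watch before any response has been ACKed. */
  static List<String> run(Gate gate) {
    CancelBeforeAckSketch stream = new CancelBeforeAckSketch(gate);
    stream.adjust("EDS", Collections.singletonList("endpoints-a"));
    stream.adjust("EDS", null);
    return stream.wire;
  }

  public static void main(String[] args) {
    System.out.println(run(Gate.VERSIONS));   // [EDS[endpoints-a]]
    System.out.println(run(Gate.SENT_TYPES)); // [EDS[endpoints-a], EDS[]]
  }
}
```

With the versions-based gate the empty unsubscribe never reaches the wire, so the modeled server keeps the stale subscription; with the sent-types gate the second request is emitted.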
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| xds/src/main/java/io/grpc/xds/client/ControlPlaneClient.java | Skips initial requests for types with null subscription on a server unless that type was previously sent on the current stream; tracks per-stream sent types. |
| xds/src/test/java/io/grpc/xds/client/ControlPlaneClientTest.java | Adds focused unit tests validating the new skip behavior, wildcard handling, and cancel-before-ACK unsubscribe correctness. |
Contributor
/gcbrun

kannanjgithub approved these changes May 5, 2026

Contributor, Author
@kannanjgithub

AgraVator pushed a commit to AgraVator/grpc-java that referenced this pull request May 11, 2026
…pc#12782)

## TL;DR

When the bootstrap declares more than one xDS server (e.g. a default server for LDS/CDS plus an authority-specific EDS-only server), grpc-java was sending CDS/LDS DiscoveryRequests to the EDS-only server too. That server replies `UNIMPLEMENTED`, which tears down the stream and EDS data never arrives. Fix: skip DiscoveryRequests for resource types we don't actually subscribe to on a given server.

## Problem

Under [A47: xDS Federation](https://github.com/grpc/proposal/blob/master/A47-xds-federation.md), authorities can declare their own `xds_servers` block in the bootstrap. When an ADS stream is opened to an authority-specific server, `ControlPlaneClient.adjustResourceSubscription` sends a `DiscoveryRequest` for every **globally-subscribed** resource type, even types that have no subscription for *this* server. The empty request still carries a `type_url`, and an authority-specific server (e.g. an EDS-only control plane) may reject it with `UNIMPLEMENTED`, which tears down the entire stream before the legitimate request that follows can complete.

## Reproduce

Bootstrap declares two servers: call them control-plane-A (handles LDS/CDS for authority A) and control-plane-B (handles EDS only for authority B).

A grpc-java channel that resolves through LDS → CDS in authority A and EDS in authority B opens an ADS stream to control-plane-B. When that stream becomes ready, [`sendDiscoveryRequests`](https://github.com/grpc/grpc-java/blob/master/xds/src/main/java/io/grpc/xds/client/ControlPlaneClient.java) iterates `resourceStore.getSubscribedResourceTypesWithTypeUrl()` (which returns **all three** types: Listener, Cluster, Endpoint) and calls `adjustResourceSubscription` for each. For Listener and Cluster, `getSubscribedResources(serverInfo, type)` returns null/empty, but the request is still sent on the wire:

```
io.grpc.StatusRuntimeException: UNAVAILABLE: Error retrieving EDS resource ...: UNIMPLEMENTED.
Details: Watches for type type.googleapis.com/envoy.config.cluster.v3.Cluster are not supported in this service
```

The Cluster (CDS) request reaches control-plane-B, gets rejected, and the stream goes into backoff with no EDS data ever delivered. grpc-go on the same bootstrap works fine against the same server, which pointed at the asymmetry.

## Fix

In `adjustResourceSubscription`, return early when both:

1. `resources` is `null` (the store reports no subscription for this type on this server), and
2. No DiscoveryRequest of this type has been sent on the current stream (`!adsStream.sentTypes.contains(resourceType)`).

Per the `ResourceStore` contract in `XdsClient.java`, a `null` return means "no subscription", while an empty collection means a **wildcard** subscription, a real subscription that must still emit the empty `resource_names` request and start its missing-resource timers.

The "DiscoveryRequest of this type has been sent on the current stream" discriminator is tracked in a per-stream `sentTypes` set on `AdsStream`, populated wherever a request is actually transmitted (initial sends, ACKs, NACKs). Per-stream is the right scope because the server's view of our subscriptions is per-stream: on stream replacement the set is cleared implicitly along with the `AdsStream` instance.

Gating on `!versions.containsKey(type)` instead would be incorrect because `versions` is only populated on ACK. If a watch is canceled after the initial DiscoveryRequest goes out but before any response is ACKed, that guard would suppress the empty unsubscribe, leaving the server with a stale subscription until the stream resets. Tracking actual sends per-stream closes that window.

The unsubscribe-all path (had-version, now-empty/null) is preserved: we still send the empty request and clear the version, telling the server to drop our subscription for that type.

## Mirrors grpc-go

This brings grpc-java in line with grpc-go's behavior. The Go equivalent is [`adsStreamImpl.sendExisting`](https://github.com/grpc/grpc-go/blob/v1.80.0/internal/xds/clients/xdsclient/ads_stream.go#L335-L368):

```go
for typ, state := range s.resourceTypeState {
	state.nonce = ""
	if len(state.subscribedResources) == 0 {
		continue // <-- explicit skip
	}

	names := resourceNames(state.subscribedResources)
	if err := s.sendMessageLocked(stream, names, typ.TypeURL, state.version, state.nonce, nil); err != nil {
		return err
	}
	s.startWatchTimersLocked(typ, names)
}
```

Two structural reasons grpc-go avoids this bug today:

1. The iteration domain is **per-stream** ([`s.resourceTypeState`](https://github.com/grpc/grpc-go/blob/v1.80.0/internal/xds/clients/xdsclient/ads_stream.go#L143)), populated only when [`subscribe`](https://github.com/grpc/grpc-go/blob/v1.80.0/internal/xds/clients/xdsclient/ads_stream.go#L167-L193) is called for a resource on this stream, so a Cluster type never even appears in the iteration of an EDS-only stream.
2. Even within that per-stream iteration, the explicit `if len(state.subscribedResources) == 0 { continue }` covers the case where the type has no subscription on this stream.

The grpc-java fix is the equivalent of (2). The `!adsStream.sentTypes.contains` guard is needed because Java's iteration domain is global (`getSubscribedResourceTypesWithTypeUrl` is xds-client-wide), so we may see types we never subscribed to on this stream.

Note that grpc-go physically separates two paths: [`sendNewLocked`](https://github.com/grpc/grpc-go/blob/v1.80.0/internal/xds/clients/xdsclient/ads_stream.go#L308-L318) handles runtime sub/unsub and sends every queued request unconditionally (so an empty unsubscribe always goes on the wire, ACK or no ACK), while `sendExisting` handles stream re-establishment and applies the `len == 0` skip. grpc-java has a single `adjustResourceSubscription` function that serves both paths. The per-stream `sentTypes` set is what lets the same guard distinguish them: on a fresh stream `sentTypes` is empty so the guard reduces to "no subscription → skip" (mirroring `sendExisting`), while at runtime after the initial request `sentTypes.contains(type)` is true so the guard does not trigger and the empty unsubscribe is sent (mirroring `sendNewLocked`).

## Test plan

Unit tests in `ControlPlaneClientTest`:

- `streamReady_skipsEmptyDiscoveryRequestForUnsubscribedType`: the federation case, asserts CDS request is suppressed and EDS still goes through
- `streamReady_sendsRequestForAllTypesWhenAllHaveResources`: guards against over-eager skip
- `streamReady_skipsTypeWithNoSubscription`: a `null` return skips
- `streamReady_sendsWildcardRequestAndStartsTimers`: empty collection (wildcard) still sends and starts timers
- `cancelBeforeAck_sendsEmptyUnsubscribe`: cancel-before-ACK timing window still emits the unsubscribe

## Spec note

xDS SoTW spec ([Envoy xDS protocol](https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol)) treats an empty `resource_names` for LDS/CDS as a wildcard subscription ("send me everything of this type"). The previous grpc-java behavior would unintentionally trigger wildcard CDS subscriptions on every authority-specific stream, which an EDS-only server is right to refuse. Skipping when no subscription exists side-steps that misinterpretation; legitimate wildcard subscriptions (empty collection from a real subscription) still go on the wire as intended, and the existing unsubscribe-all path (with a prior version) continues to work.
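To make the Reproduce scenario concrete, here is a rough sketch of the stream-ready loop over a global iteration domain. It is a toy model under assumptions, not grpc-java code: `FederationReadySketch`, `typesSentOnReady`, the short type labels, and the resource names are hypothetical, and a missing map entry stands in for a `null` return from the store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the stream-ready loop; names are illustrative, not grpc-java APIs. */
final class FederationReadySketch {
  // Global iteration domain: every type subscribed anywhere in the client.
  static final List<String> GLOBAL_TYPES = Arrays.asList("LDS", "CDS", "EDS");

  // server -> type -> subscribed names. A missing type models a null return.
  static final Map<String, Map<String, Collection<String>>> STORE = new HashMap<>();

  static {
    Map<String, Collection<String>> a = new HashMap<>();
    a.put("LDS", Collections.singletonList("listener-a"));
    a.put("CDS", Collections.singletonList("cluster-a"));
    STORE.put("control-plane-A", a);

    Map<String, Collection<String>> b = new HashMap<>();
    b.put("EDS", Collections.singletonList("endpoints-b"));
    STORE.put("control-plane-B", b);
  }

  /** Types requested when a fresh stream to the given server becomes ready. */
  static List<String> typesSentOnReady(String server, boolean applySkip) {
    List<String> sent = new ArrayList<>();
    for (String type : GLOBAL_TYPES) {
      Collection<String> resources = STORE.get(server).get(type);
      if (applySkip && resources == null) {
        continue; // fresh stream: nothing sent yet, so "no subscription" means skip
      }
      sent.add(type);
    }
    return sent;
  }

  public static void main(String[] args) {
    System.out.println(typesSentOnReady("control-plane-B", false)); // [LDS, CDS, EDS]
    System.out.println(typesSentOnReady("control-plane-B", true));  // [EDS]
    System.out.println(typesSentOnReady("control-plane-A", true));  // [LDS, CDS]
  }
}
```

Without the skip, the modeled EDS-only server receives LDS and CDS requests it cannot serve; with it, each server sees only the types actually subscribed on it.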