Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify stale nonce handling in xDS protocol spec #10363

Open
markdroth opened this issue Mar 12, 2020 · 5 comments
Open

clarify stale nonce handling in xDS protocol spec #10363

markdroth opened this issue Mar 12, 2020 · 5 comments
Labels
api/v3 Major version release @ end of Q3 2019 design proposal Needs design doc/proposal before implementation help wanted Needs help!

Comments

@markdroth
Copy link
Contributor

Title: Clarify stale nonce handling in xDS protocol spec

Description:
The xDS protocol spec currently says the following:

The management server should not send a DiscoveryResponse for any DiscoveryRequest that has a stale nonce. A nonce becomes stale following a newer nonce being presented to Envoy in a DiscoveryResponse.

This means that if the server sends an update to the client at the same moment that the client sends a new request changing the resource_names list, the client's request will be ignored by the server. The expectation here seems to be that this will not actually cause a problem, because the server will see the updated resource_names list when the client ACKs the response that the server just sent.

This approach could cause a problem in certain edge cases. For example, if the server is sending updates at a fixed interval that happens to exactly match the RTT to the client, then every time the client sends an ACK, it will be considered stale by the time it arrives at the server. In this case, the server will never see the updated resource_names list sent by the client.

This is admittedly probably fairly unlikely, but it would be good to clarify the protocol somehow to avoid this.

When I talked with @htuch about this, he said that management servers should ideally make sure that they only have one outstanding response to a client at any given time. In other words, a management server should ideally not send a new response until it sees the ACK/NACK for the previous response. However, the protocol spec does not say anything about requiring this.

One possible way to deal with this would be to change the spec to say that management servers must either (a) have only one outstanding response to a client at any given time or (b) not ignore requests with stale nonces. In other words, if the management server is not doing something to ensure that most of the client's requests will be non-stale, it cannot ignore stale requests.

@htuch htuch added api/v3 Major version release @ end of Q3 2019 design proposal Needs design doc/proposal before implementation help wanted Needs help! and removed area/configuration labels Mar 13, 2020
@htuch
Copy link
Member

htuch commented Mar 13, 2020

@markdroth thanks for the writeup, I think that captures the crux of the conversation. It's probably worth clarifying, I would use the wording "should" rather than "must" though, unless we want to reserve this for future versions, since existing management servers/clients aren't necessarily aware of this spec change and probably are happy today being oblivious to the issue.

Do you want to do a PR? Does anyone else in the management plan community care to chip in?

@envoyproxy/go-control-plane @envoyproxy/java-control-plane @envoyproxy/api-shepherds

@markdroth
Copy link
Contributor Author

I'd be happy to put together a PR once we have consensus on what it should say.

@slonka
Copy link
Member

slonka commented Mar 17, 2020

@lukidzi - you recently worked on edge cases with resource_names and stale nonce, can you weight in on this?

@markdroth
Copy link
Contributor Author

Looks like we never followed up on this. Harvey, should I just put together a PR along the lines of what we discussed above?

@htuch
Copy link
Member

htuch commented Jan 28, 2021

Sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/v3 Major version release @ end of Q3 2019 design proposal Needs design doc/proposal before implementation help wanted Needs help!
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants