rpc: make blocking queries for non-existent items more efficient #12110

dnephin · 2022-01-17T22:41:43Z

Branched from #12109, fixes #9449

Full details about this change are in the commit messages. This is a problem that has been reported a few times. It is particularly problematic when connect is enabled, because agents will do many blocking queries for config-entries that often do not exist.

This PR improves the behaviour of blocking queries when the result of the query is "no items found" (no modify index to use as the QueryMeta.Index). This change makes blockingQuery more efficient, but there is still possibility for improvement. Currently we have to return on the first iteration if there was a write between the time the previous RPC call was returned and the new one was received (because the table max index would have changed). We can fix that case by adding a new query option, which I will do in a follow up PR.

TODO:

more test cases for state changes between existing->not-existing and not-existing->existing
return errNotFound from all the RPC endpoint functions passed to blockingQuery (where appropriate), so that all queries benefit from this enhancement. Done for all "get" endpoints. List endpoints will be left for a future PR as those are more involved.
changelog entry

dnephin · 2022-01-17T22:45:49Z

agent/consul/config_endpoint.go

@@ -142,7 +142,7 @@ func (c *ConfigEntry) Get(args *structs.ConfigEntryQuery, reply *structs.ConfigE

 			reply.Index = index
 			if entry == nil {
-				return nil
+				return errNotFound


This is the change that needs to be applied to all the other RPC endpoints to take advantage of this enhancement. Currently it is only implemented on this one ConfigEntry.Get endpoint.

banks · 2022-01-25T12:28:02Z

Nice Daniel! This is a really smart way to save a lot of CPU time in this common case without invasive changes to how we store things!

If I understand correctly, this saves on the CPU overhead of building a response, serializing it, writing to the wire, completing the request, and then accepting a new request from the client, deserialising etc.

The one part that remains that potentially costs something is the re-query of the state store for every change. But I expect that to be a very small portion of the CPU usage end to end here since in this case the query should be a single index lookup that returns no results and so a quick exit.

Did you see any data while debugging that indicated the proportion of CPU that could be saved here vs the internal query parts? No worries if not just interested.

For comparison, where this query work was an issue in the past it was for service health with a large number of instances because then there was O(N) work to do joining all the node, check and instance information to build the result, so when N got large, and many watchers were all doing this in a semi-hot loop (on every service update) it got really expensive. (That's different from the empty case but amounted to the same thing thanks to our limit of 2048 WatchSet chans after which we fell back to whole-table index watching, we fixed this with per-service indexes specifically for Health queries.)

So yeah this seems like a really simple change that should be a very significant CPU saving in this common case 👏

dnephin · 2022-01-25T18:01:39Z

Yes, exactly! I have not measured the cost of the extra query, but I don't expect that to be significant. I think it's worth mentioning that this extra cost already exists in all of our blocking queries. I believe the WatchSet can fire any time a node is modified because of a delete or insert. The frequency of these fires is hard to predict, but it should be fair to say it happens often enough.

Any time the WatchSet fires due to a node update, we'd have the same behaviour as you describe. There would be an extra state store query, but the returned index would not have changed (because it should generally be the ModifyIndex), so we would never notice those queries in our logs or metrics.

I think given that this already happens today in normal blocking-queries, there's no real concern that it will continue to happen for blocking-queries for non-existent entries.

rboyer · 2022-02-14T20:32:05Z

agent/consul/kvs_endpoint.go

-				} else {
-					reply.Index = index
-				}
+				reply.Index = index


Presumably this part of the change was intentional?

Ya, this is now handled generally in Server.blockingQuery by calling Server.setQueryMeta, and setting the value to 1 if it is still 0. I guess this logic in kvs_endpoint predated the general fix in blockingQuery.

rboyer · 2022-02-15T20:52:46Z

agent/consul/rpc.go

-		if err != nil {
+		switch {
+		case errors.Is(err, errNotFound):
+			if notFound {


Why do we need var notFound bool? It seems like it moots the whole point, because the first errNotFound skips the conditional here, and proceeds exactly like the prior scenario, never executing line 978.

Good question!

I believe what you are calling out is that on the first iteration of this loop we still rely on the index not having changed. If a write happens between the time the last blocking query returned and when the next blocking query request happens (which I think should generally only be a few milliseconds), then this optimization does not work. We continue to have the old behaviour of returning unnecessarily. Generally that should be ok because it only limits the optimization, should not impact correctness, and there is a very small window where a write must happen to cause this problem.

There is a way to work around this limitation (could be a follow up), but it introduces some risk, so I thought it would be better to avoid. I haven't documented that anywhere yet (just some discussions in old slack threads), but I can do that here.

The work around for this limitation would be to add a new X-Consul-Index-Not-Found query parameter (as a sibling to X-Consul-Index). This parameter would be used to set the initial value of var notFound bool. That way when a new query comes in, we'll already know if we're in a "not found" state, and only the creation of an item (or a timeout) will cause blockingQuery to return. The risk here is that if the user incorrectly sets this header, then they could miss a delete that happened, but I haven't thought enough about this case to know if it's actually a problem for correctness. Maybe we should do this as a follow up.

Why do we need var notFound bool?

To answer the other part of this question, this is essential to the optimization. I think this is explained briefly in 9abb49da51ad87312a5f34364d8e6a3c0e35c328, but I'm happy to add more detail since this is the main reason for the change. As documented in #12343, there are two things to consider here: the "last modified index" set on responseMeta.Index, and the WatchSet blocking.

We can't fix the WatchSet blocking without turning every read operation into a write that modifies the memdb index (I'm pretty sure this is not an approach we want to investigation). Maybe there are other ways of making WatchSet block correctly for not found items? I have not found any.

So that leaves the index comparison (responseMeta.Index > query.MinIndex). Generally we know this comparison will return true in these "not found" cases, because responseMeta.Index will be the max index of the table, and it changes on every write.

What we need is a comparison we can use instead of responseMeta.Index > query.MinIndex.

If we think about this comparison more abstractly, I think we'll see that the question this is asking is "has the result of the query changed from the last result sent to the client". There are plenty of ways that question could be answered, using the modify index is a very cost effective one, so that's generally what we use.

However as we've noticed, using that modify index doesn't work for cases where an item does not exist, because we don't have a stable modify index to compare against. In this case we use "not found" to indicate if the result has changed.

Since we don't know the state from the client perspective (X-Consul-Index-Not-Found would give us that), we have to perform the query once. If the result of the query is the same "last modified index", then we know at that index the query returned no results (which tells us the client's last view was "not found"). Until the query returns some result, the state has not changed, so we can continue to block until that time.

In the line below this, when we see "state has not changed, the result is still not found", we update the minQueryIndex to the latest responseMeta.Index, so that the comparison later keeps us in the for loop.

I opened #12363 for that potential future optimization.

dnephin

I'll rebase this PR, it confliced with some renames from the docs PR.

I'll also update the godoc to document this special errNotFound that the query function can return.

dnephin · 2022-02-15T21:18:37Z

agent/consul/rpc.go

-		if err != nil {
+		switch {
+		case errors.Is(err, errNotFound):
+			if notFound {


Good question!

I believe what you are calling out is that on the first iteration of this loop we still rely on the index not having changed. If a write happens between the time the last blocking query returned and when the next blocking query request happens (which I think should generally only be a few milliseconds), then this optimization does not work. We continue to have the old behaviour of returning unnecessarily. Generally that should be ok because it only limits the optimization, should not impact correctness, and there is a very small window where a write must happen to cause this problem.

There is a way to work around this limitation (could be a follow up), but it introduces some risk, so I thought it would be better to avoid. I haven't documented that anywhere yet (just some discussions in old slack threads), but I can do that here.

The work around for this limitation would be to add a new X-Consul-Index-Not-Found query parameter (as a sibling to X-Consul-Index). This parameter would be used to set the initial value of var notFound bool. That way when a new query comes in, we'll already know if we're in a "not found" state, and only the creation of an item (or a timeout) will cause blockingQuery to return. The risk here is that if the user incorrectly sets this header, then they could miss a delete that happened, but I haven't thought enough about this case to know if it's actually a problem for correctness. Maybe we should do this as a follow up.

Why do we need var notFound bool?

To answer the other part of this question, this is essential to the optimization. I think this is explained briefly in 9abb49da51ad87312a5f34364d8e6a3c0e35c328, but I'm happy to add more detail since this is the main reason for the change. As documented in #12343, there are two things to consider here: the "last modified index" set on responseMeta.Index, and the WatchSet blocking.

We can't fix the WatchSet blocking without turning every read operation into a write that modifies the memdb index (I'm pretty sure this is not an approach we want to investigation). Maybe there are other ways of making WatchSet block correctly for not found items? I have not found any.

So that leaves the index comparison (responseMeta.Index > query.MinIndex). Generally we know this comparison will return true in these "not found" cases, because responseMeta.Index will be the max index of the table, and it changes on every write.

What we need is a comparison we can use instead of responseMeta.Index > query.MinIndex.

If we think about this comparison more abstractly, I think we'll see that the question this is asking is "has the result of the query changed from the last result sent to the client". There are plenty of ways that question could be answered, using the modify index is a very cost effective one, so that's generally what we use.

However as we've noticed, using that modify index doesn't work for cases where an item does not exist, because we don't have a stable modify index to compare against. In this case we use "not found" to indicate if the result has changed.

Since we don't know the state from the client perspective (X-Consul-Index-Not-Found would give us that), we have to perform the query once. If the result of the query is the same "last modified index", then we know at that index the query returned no results (which tells us the client's last view was "not found"). Until the query returns some result, the state has not changed, so we can continue to block until that time.

In the line below this, when we see "state has not changed, the result is still not found", we update the minQueryIndex to the latest responseMeta.Index, so that the comparison later keeps us in the for loop.

This test shows how blocking queries are not efficient when the query returns no results. The test fails with 100+ calls instead of the expected 2. This test is still a bit flaky because it depends on the timing of the writes. It can sometimes return 3 calls. A future commit should fix this and make blocking queries even more optimal for not-found results.

By using the query results as state. Blocking queries are efficient when the query matches some results, because the ModifyIndex of those results, returned as queryMeta.Mindex, will never change unless the items themselves change. Blocking queries for non-existent items are not efficient because the queryMeta.Index can (and often does) change when other entities are written. This commit reduces the churn of these queries by using a different comparison for "has changed". Instead of using the modified index, we use the existence of the results. If the previous result was "not found" and the new result is still "not found", we know we can ignore the modified index and continue to block. This is done by setting the minQueryIndex to the returned queryMeta.Index, which prevents the query from returning before a state change is observed.

Any query that returns a list of items is not part of this commit.

rboyer

LGTM

dnephin · 2022-02-17T23:09:27Z

I'm going to merge without any backport tags for now. I'll need to backport #12109 before this one, otherwise it will never apply.

hc-github-team-consul-core · 2022-02-17T23:10:45Z

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/588613.

…g entries are involved Starting from and extending the mechanism introduced in #12110 we can specially handle the following special Consul RPC endpoints that react to many config entries in a single blocking query in Connect: - DiscoveryChain.Get - ConfigEntry.ResolveServiceConfig - Intentions.Match - ConfigEntry.List - Internal.IntentionUpstreams All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism. (A) No Config Entries Found In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110. (B) No Change Since Last Wakeup Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110. This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.

…g entries are involved (#12362) Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect: - `DiscoveryChain.Get` - `ConfigEntry.ResolveServiceConfig` - `Intentions.Match` All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism. ### No Config Entries Found In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110. ### No Change Since Last Wakeup Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110. This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report. ### Extra Endpoints Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated: - `ConfigEntry.List` - `Internal.IntentionUpstreams` (tproxy)

dnephin added type/enhancement Proposed improvement or new feature theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics labels Jan 17, 2022

dnephin changed the title ~~rpc: make blocking queries for non-existent items more efficient.~~ rpc: make blocking queries for non-existent items more efficient Jan 17, 2022

github-actions bot removed the theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics label Jan 17, 2022

dnephin force-pushed the dnephin/blocking-queries-not-found branch from 6c81389 to ce99688 Compare January 17, 2022 22:44

vercel bot temporarily deployed to Preview – consul-ui-staging January 17, 2022 22:44 Inactive

vercel bot temporarily deployed to Preview – consul January 17, 2022 22:44 Inactive

dnephin commented Jan 17, 2022

View reviewed changes

freddygv mentioned this pull request Jan 19, 2022

Expand blocking query docs beyond instructions for usage #12115

Open

dnephin added the theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics label Jan 25, 2022

dnephin force-pushed the dnephin/blocking-queries-not-found branch from ce99688 to 4457d6a Compare January 25, 2022 23:51

vercel bot temporarily deployed to Preview – consul-ui-staging January 25, 2022 23:51 Inactive

vercel bot temporarily deployed to Preview – consul January 25, 2022 23:51 Inactive

dnephin marked this pull request as ready for review January 25, 2022 23:51

Base automatically changed from dnephin/blocking-query-1 to main January 26, 2022 23:13

rboyer reviewed Feb 14, 2022

View reviewed changes

dnephin mentioned this pull request Feb 14, 2022

rpc: improve docs for blockingQuery #12343

Merged

dnephin force-pushed the dnephin/blocking-queries-not-found branch from 4457d6a to ae10e60 Compare February 15, 2022 18:44

vercel bot temporarily deployed to Preview – consul February 15, 2022 18:45 Inactive

vercel bot temporarily deployed to Preview – consul-ui-staging February 15, 2022 18:45 Inactive

rboyer reviewed Feb 15, 2022

View reviewed changes

dnephin commented Feb 15, 2022

View reviewed changes

dnephin added 3 commits February 15, 2022 18:23

rpc: add errNotFound to all Get queries

8a6e75a

Any query that returns a list of items is not part of this commit.

dnephin force-pushed the dnephin/blocking-queries-not-found branch from ae10e60 to 8a6e75a Compare February 15, 2022 23:24

vercel bot temporarily deployed to Preview – consul February 15, 2022 23:24 Inactive

vercel bot temporarily deployed to Preview – consul-ui-staging February 15, 2022 23:24 Inactive

dnephin mentioned this pull request Feb 15, 2022

rpc: add a query parameter to set initial value for blockingQuery "not found" #12363

Open

rboyer approved these changes Feb 17, 2022

View reviewed changes

rboyer mentioned this pull request Feb 17, 2022

server: suppress spurious blocking query returns where multiple config entries are involved #12362

Merged

7 tasks

dnephin merged commit 85ecbaf into main Feb 17, 2022

dnephin deleted the dnephin/blocking-queries-not-found branch February 17, 2022 23:09

dnephin mentioned this pull request Feb 18, 2022

rpc: set response to nil when not found #12389

Merged

This was referenced Feb 25, 2022

[1.11.x] backport of a series of blocking query readability and functionality improvements #12454

Merged

[1.10.x] backport of a series of blocking query readability and functionality improvements #12456

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rpc: make blocking queries for non-existent items more efficient #12110

rpc: make blocking queries for non-existent items more efficient #12110

dnephin commented Jan 17, 2022 •

edited

dnephin Jan 17, 2022

banks commented Jan 25, 2022 •

edited

dnephin commented Jan 25, 2022 •

edited

rboyer Feb 14, 2022

dnephin Feb 14, 2022

rboyer Feb 15, 2022

dnephin Feb 15, 2022 •

edited

dnephin Feb 15, 2022

dnephin left a comment

dnephin Feb 15, 2022 •

edited

rboyer left a comment

dnephin commented Feb 17, 2022

hc-github-team-consul-core commented Feb 17, 2022

rpc: make blocking queries for non-existent items more efficient #12110

rpc: make blocking queries for non-existent items more efficient #12110

Conversation

dnephin commented Jan 17, 2022 • edited

dnephin Jan 17, 2022

Choose a reason for hiding this comment

banks commented Jan 25, 2022 • edited

dnephin commented Jan 25, 2022 • edited

rboyer Feb 14, 2022

Choose a reason for hiding this comment

dnephin Feb 14, 2022

Choose a reason for hiding this comment

rboyer Feb 15, 2022

Choose a reason for hiding this comment

dnephin Feb 15, 2022 • edited

Choose a reason for hiding this comment

dnephin Feb 15, 2022

Choose a reason for hiding this comment

dnephin left a comment

Choose a reason for hiding this comment

dnephin Feb 15, 2022 • edited

Choose a reason for hiding this comment

rboyer left a comment

Choose a reason for hiding this comment

dnephin commented Feb 17, 2022

hc-github-team-consul-core commented Feb 17, 2022

dnephin commented Jan 17, 2022 •

edited

banks commented Jan 25, 2022 •

edited

dnephin commented Jan 25, 2022 •

edited

dnephin Feb 15, 2022 •

edited

dnephin Feb 15, 2022 •

edited