rpc: make blockingQuery easier to read #12109

dnephin · 2022-01-17T22:00:22Z

This is being done to make space for fixing #9449. Best viewed by individual commit. There is more detail in the commit messages.

The current implementation of blockingQueries is difficult to read, and difficult to modify, for a few reasons.

the goto. While it is safe to use goto in Go, most readers will not be used to following logic that uses goto. A for loop is not only much easier to reason about, it removes the need to pre-define a bunch of variables at the top of the function. Sometimes a goto is the right solution for complicated logic, but I believe this refactor shows it is not necessary here, and that a for loop reads better.
there were many long inline block comments. These comments are almost always better as godoc comments on functions (even if that function is unexported). A large block comment like this breaks the flow of someone trying to read the code. If the flow is going to be broken anyway, the code that the comment is documented may as well be in a separate function. That allows the main function to read much better without losing the flow. I left a couple of the shorter multi-line comments, but I do think we would benefit from reducing or removing those as well.
the block and exit logic was difficult to reason about because it was not using guard clauses for early returns. It relied on falling through to the end of the function. By inverting these conditions we can see the flow much better.

The getter methods on QueryOptionsCompat and QueryMetaCompat also make the code a little harder to read. It'd be great to use a struct field directly, or rename the interface methods to have them read a bit more naturally. Those renames are left for a future commit.

To remove the TODO, and make it more readable. In general this reduces the scope of variables, making them easier to reason about. It also introduces more early returns so that we can see the flow from the structure of the function.

This safeguard should be safe to apply in general. We are already applying it to non-blocking queries that call blockingQuery, so it should be fine to apply it to others.

This helps keep the logic in blockingQuery more focused. In the future we may have a separate struct for RPC queries which may allow us to move this off of Server.

Remove some unnecessary comments around query_blocking metric. The only line that needs any comments in the atomic decrement. Cleanup the block and return comments and logic. The old comment about AbandonCh may have been relevant before, but it is expected behaviour now. The logic was simplified by inverting the err condition.

dnephin · 2022-01-17T22:08:19Z

agent/consul/rpc.go

+	if m.GetIndex() < 1 {
+		m.SetIndex(1)
+	}


Despite the implication by the old comment, this was done for all queries that went through blockingQuery , even those with index=0 that were not actually blocking queries.

With this change, we continue to set it for all queries that go through blockingQuery, but we will now also set the default response index in a few queries that do their own thing (ex: the Txn operation). That's why I had to modify the expected value of a couple tests below.

kisunji · 2022-01-17T23:18:47Z

agent/consul/rpc.go

+		// block until something changes, or the timeout
+		if err := ws.WatchCtx(ctx); err != nil {
+			// exit if we've reached the timeout, or other cancellation
+			return nil


I think there's a difference in codepath: previously,blockingQuery would return any non-nil err whereas here we suppress it and return nil instead.

If that's intentional, do we want a more robust error check here? It looks like we know what errors to expect but if ws.WatchCtx ever changes to return other kinds of errors this might lead to unexpected behavior

I may be wrong, but I believe the old code would only ever return the err from fn(), which we still do here on line 970.

In the old code, the error from WatchCtx would only be scoped to that one block, and nothing in that block returned.

That's a good point about future compatibility though. I was also thinking of improving the godoc for WatchCtx so that it doesn't change functionality. I'd like to make the godoc for that method be more explicit about when an error can be returned (only if the context is cancelled), and what errors can be returned (only the context error).

I think if we did that, any change to errors returned by WatchCtx would be a breaking change. We could still guard against it here by checking the error, but I'm usually hesitant to add checks for conditions that can't currently exist.

Ah good point, I missed the shadowed err from WatchCtx. I poked at the WatchCtx definition and what you said makes good sense! 👍

kisunji

Looks much more readable and easier to reason about! Thanks for this change

hc-github-team-consul-core · 2022-01-26T23:14:26Z

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/567779.

dnephin added 5 commits January 17, 2022 16:58

rpc: refactor blocking query

f92dc11

To remove the TODO, and make it more readable. In general this reduces the scope of variables, making them easier to reason about. It also introduces more early returns so that we can see the flow from the structure of the function.

rpc: add subtests to blockingQuery test

4b67d6c

rpc: move the index defaulting to setQueryMeta.

fd0a9fd

This safeguard should be safe to apply in general. We are already applying it to non-blocking queries that call blockingQuery, so it should be fine to apply it to others.

rpc: extract rpcQueryTimeout method

72a733b

This helps keep the logic in blockingQuery more focused. In the future we may have a separate struct for RPC queries which may allow us to move this off of Server.

dnephin added theme/internal-cleanup Used to identify tech debt, testing improvements, code refactoring, and non-impactful optimization pr/no-changelog PR does not need a corresponding .changelog entry labels Jan 17, 2022

dnephin requested a review from a team January 17, 2022 22:00

dnephin added the pr/no-metrics-test label Jan 17, 2022

dnephin commented Jan 17, 2022

View reviewed changes

dnephin mentioned this pull request Jan 17, 2022

rpc: make blocking queries for non-existent items more efficient #12110

Merged

3 tasks

kisunji reviewed Jan 17, 2022

View reviewed changes

kisunji approved these changes Jan 18, 2022

View reviewed changes

kisunji requested a review from a team January 18, 2022 17:46

dnephin merged commit db04782 into main Jan 26, 2022

dnephin deleted the dnephin/blocking-query-1 branch January 26, 2022 23:13

dnephin mentioned this pull request Feb 14, 2022

rpc: improve docs for blockingQuery #12343

Merged

This was referenced Feb 25, 2022

[1.11.x] backport of a series of blocking query readability and functionality improvements #12454

Merged

[1.10.x] backport of a series of blocking query readability and functionality improvements #12456

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rpc: make blockingQuery easier to read #12109

rpc: make blockingQuery easier to read #12109

dnephin commented Jan 17, 2022 •

edited

dnephin Jan 17, 2022

kisunji Jan 17, 2022

dnephin Jan 18, 2022 •

edited

kisunji Jan 18, 2022

kisunji left a comment

hc-github-team-consul-core commented Jan 26, 2022

rpc: make blockingQuery easier to read #12109

rpc: make blockingQuery easier to read #12109

Conversation

dnephin commented Jan 17, 2022 • edited

dnephin Jan 17, 2022

Choose a reason for hiding this comment

kisunji Jan 17, 2022

Choose a reason for hiding this comment

dnephin Jan 18, 2022 • edited

Choose a reason for hiding this comment

kisunji Jan 18, 2022

Choose a reason for hiding this comment

kisunji left a comment

Choose a reason for hiding this comment

hc-github-team-consul-core commented Jan 26, 2022

dnephin commented Jan 17, 2022 •

edited

dnephin Jan 18, 2022 •

edited