Mango fields pushdown #4394
Conversation
+1
A very nice optimization!
(See a minor grammar nit and a question about a stronger match)
I needed to understand the format of arguments to `match/2` when writing the code to support projecting fields on the shard, so I wrote some code to figure it out as a test. I figure this may be useful for future work in this area, so I pushed it as a commit.
This commit aims to improve Mango by reducing the data transferred to the coordinator during query execution. It may reduce memory or CPU use at the coordinator, but that isn't the primary goal.

Currently, when documents are read at the shard level, they are compared locally at the shard with the selector to ensure they match before they are sent to the coordinator. This ensures we're not sending documents across the network that the coordinator immediately discards, saving bandwidth and coordinator processing. This commit further executes field projection (`fields` in the query) at the shard level. This should further save bandwidth, particularly for queries that project few fields from large documents.

One item of complexity is that a query may request a quorum read of documents, meaning that we need to do the document read at the coordinator and not the shard, then perform the `selector` and `fields` processing there rather than at the shard. To ensure that documents are processed consistently whether at the shard or coordinator, match_and_extract_doc/3 is added. There is still one orphan call to extract/2 outside match_and_extract_doc/3, which supports cluster upgrades and should later be removed.

Shard level processing is already performed in a callback, view_cb/2, that's passed to fabric's view processing to run for each row in the view result set. It's used for the shard-local selector and fields processing. To make it clear what arguments are destined for this callback, the commit encapsulates the arguments using viewcbargs_new/2 and viewcbargs_get/2. As we push down more functionality to the shard, the context this function needs to carry will increase, so having a record for it will be valuable.

Supporting cluster upgrades:

The commit supports shard pushdown for Mango `fields` processing for situations during rolling cluster upgrades. In the state where a non-upgraded coordinator is speaking to an upgraded node, view_cb/2 needs to support being passed just the `selector` outside of the new viewcbargs record. In this case, the shard will not process fields, but the coordinator will. In the situation where the coordinator is upgraded but the shard is not, we need to send the selector to the shard via `selector` and also execute the fields projection at the coordinator. Therefore we pass arguments to view_cb/2 via both `selector` and `callback_args`, and have an apparently spurious field projection (mango_fields:extract/2) in the code that receives back values from the shard (factored out into doc_member_and_extract). Both of these affordances should only need to exist through one minor version change and be removed thereafter -- if people are jumping several minor versions of CouchDB in one go, hopefully they are prepared for a bit of trouble.

Testing upgrade states:

As view_cb is completely separate from the rest of the cursor code, we can first try out the branch's code using view_cb from `main`, and then the other way -- the branch's view_cb with the rest of the file from main. I did both of these tests successfully.
% This supports receiving our "arguments" either as just the `selector`
% or in the new record in `callback_args`. This is to support mid-upgrade
% clusters where the non-upgraded coordinator nodes will send the older style.
% TODO remove this in a couple of couchdb versions.
might be worth codifying this a little. I know we have other places, but maybe we can start here:
maybe we can make a comment like `% x-couch-remove: 4.0.0`
that we can then grep for prior to releasing 4.0.0.
The reasoning here is that we can say: upgrade everything to the latest 3.x version and THEN going to 4.0.0 will be smooth, rather than supporting any 3.x -> 4.0.0 path.
This seems reasonable. My question is: does this removal need to wait for 4.x, given it's not a breaking change, or does it need to wait a couple of minor -- or patch -- releases?
I don't know how many people do rolling upgrades outside of Cloudant, as opposed to just stopping the world for a few minutes.
I think enough folks do this, and not necessarily within a small round of point releases, which would warrant keeping it until 4.0.0
but I’m fine with making this change in a follow-up PR where we go through all our rolling-update compat code, and merging this as-is #ScopeCreep
Okay. Sounds good.
We've been informally tagging those with "backwards compatibility" or "upgrade clause" notes in the comments. Searching for "TODO" would work as well. Some official marker would be better, of course.
Another way is to add a config parameter, like we had for rexi `use_kill_all`, but there is a balance there.
But I am inclined to merge it as is and do the extra formalizing as another task.
Overview
This PR aims to improve Mango by reducing the data transferred to
the coordinator during query execution. It may reduce memory or CPU use
at the coordinator, but that isn't the primary goal.
Currently, when documents are read at the shard level, they are compared
locally at the shard with the selector to ensure they match before they
are sent to the coordinator. This ensures we're not sending documents
across the network that the coordinator immediately discards, saving
bandwidth and coordinator processing. This PR further executes field
projection (`fields` in the query) at the shard level. This should
further save bandwidth, particularly for queries that project few fields
from large documents.
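To make the bandwidth saving concrete, here is a sketch of the kind of `fields` projection the PR pushes down to the shard. This is Python written for illustration only (CouchDB's actual implementation is Erlang, in mango_fields:extract/2); the helper names are hypothetical. Dotted paths select nested values, and everything not requested is dropped before the row leaves the shard.

```python
def lookup_path(doc, parts):
    """Walk a dotted path through nested dicts; return (value, found)."""
    current = doc
    for part in parts:
        if not isinstance(current, dict) or part not in current:
            return None, False
        current = current[part]
    return current, True

def extract_fields(doc, fields):
    """Return a copy of doc containing only the requested dotted fields.

    fields=None means "no projection requested": keep the whole document.
    """
    if fields is None:
        return doc
    projected = {}
    for path in fields:
        parts = path.split(".")
        value, found = lookup_path(doc, parts)
        if not found:
            continue  # silently skip fields the document doesn't have
        dst = projected
        for part in parts[:-1]:
            dst = dst.setdefault(part, {})
        dst[parts[-1]] = value
    return projected
```

A query projecting `["_id", "name.first"]` from a document carrying a large payload would now ship only those two values to the coordinator instead of the full body.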
One item of complexity is that a query may request a quorum read of
documents, meaning that we need to do the document read at the
coordinator and not the shard, then perform the `selector` and `fields`
processing there rather than at the shard. To ensure that documents are
processed consistently whether at the shard or coordinator,
match_and_extract_doc/3 is added. There is still one orphan call to
extract/2 outside match_and_extract_doc/3, which supports cluster
upgrades and should later be removed.
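The value of a single shared helper can be illustrated like this (Python for exposition, not the PR's Erlang): both the shard and the coordinator call the same match-then-project function, so a document is treated identically wherever it is processed. The `matches` here is a toy equality-only selector; real Mango selectors support operators such as `$gt` and `$in`.

```python
def matches(doc, selector):
    """Toy selector: every key in the selector must equal the doc's value."""
    return all(doc.get(key) == value for key, value in selector.items())

def extract_fields(doc, fields):
    """Toy top-level-only projection; fields=None keeps the whole doc."""
    if fields is None:
        return doc
    return {key: doc[key] for key in fields if key in doc}

def match_and_extract_doc(doc, selector, fields):
    """Return the projected doc if it matches the selector, else None.

    One code path, called from both the shard callback (normal reads)
    and the coordinator (quorum reads).
    """
    if not matches(doc, selector):
        return None  # non-matching rows are discarded before any projection
    return extract_fields(doc, fields)
```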
Shard level processing is already performed in a callback, view_cb/2,
that's passed to fabric's view processing to run for each row in the
view result set. It's used for the shard-local selector and fields
processing. To make it clear what arguments are destined for this
callback, the PR encapsulates the arguments, using viewcbargs_new/2
and viewcbargs_get/2.
As we push down more functionality to the shard, the context this
function needs to carry with it will increase, so having a record for it
will be valuable.
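The encapsulation idea can be sketched as a record type; in Python terms (the shape mirrors the Erlang viewcbargs_new/2 and viewcbargs_get/2 accessors, but this class is hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ViewCbArgs:
    """Context the shard-side view callback carries.

    Future pushdown work can add attributes here without changing
    every call site that constructs or unpacks the arguments.
    """
    selector: dict
    fields: Optional[list] = None  # None: leave projection to the coordinator
```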
Supporting cluster upgrades:
The PR supports shard pushdown for Mango `fields` processing for
situations during rolling cluster upgrades. (Cloudant requires this,
as they use rolling upgrades.)
In the state where a non-upgraded coordinator is speaking to an upgraded
node, view_cb/2 needs to support being passed just the `selector`
outside of the new viewcbargs record. In this case, the shard will not
process fields, but the coordinator will.
In the situation where the coordinator is upgraded but the shard is not,
we need to send the selector to the shard via `selector` and also
execute the fields projection at the coordinator. Therefore we pass
arguments to view_cb/2 via both `selector` and `callback_args`, and have
an apparently spurious field projection (mango_fields:extract/2) in the
code that receives back values from the shard (factored out into
doc_member_and_extract).
Both of these affordances should only need to exist through one minor
version change and be removed thereafter -- if people are jumping
several minor versions of CouchDB in one go, hopefully they are prepared
for a bit of trouble.
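The mid-upgrade argument handling described above can be sketched as a small normalizer (Python for illustration; the PR's view_cb/2 is Erlang, and this function name is hypothetical). The key names mirror the PR's `selector` and `callback_args`:

```python
def normalize_view_cb_args(options):
    """Accept both wire formats and decide whether the shard may project.

    Returns (selector, fields); fields is None when the fields
    projection must be left to the coordinator.
    """
    args = options.get("callback_args")
    if args is not None:
        # Upgraded coordinator: shard applies both the selector and
        # the fields projection before sending the row back.
        return args["selector"], args["fields"]
    # Old-style coordinator sent only the bare selector: the shard
    # matches documents, and the coordinator keeps projecting fields.
    return options.get("selector"), None
```

Once the compat window closes, the bare-`selector` branch can be deleted and only the encapsulated form remains.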
Testing upgrade states:
As view_cb is completely separate from the rest of the cursor code,
we can first try out the branch's code using view_cb from `main`, and
then the other way -- the branch's view_cb with the rest of the file
from main. I did both of these tests successfully.
Testing recommendations
This PR should not change anything from an end user perspective. Mango responses should remain the same as they currently are.
I have run some basic performance tests locally using k6.io, which showed no meaningful change in request latency.
Related Issues or Pull Requests
none.
Checklist
- Any new configurable parameters are documented in rel/overlay/etc/default.ini
- Documentation changes were made in the src/docs folder