
Mango fields pushdown #4394

Merged · 2 commits · Jan 20, 2023

Conversation

mikerhodes (Contributor)

Overview

This PR aims to improve Mango by reducing the data transferred to
the coordinator during query execution. It may reduce memory or CPU use
at the coordinator but that isn't the primary goal.

Currently, when documents are read at the shard level, they are compared
locally at the shard with the selector to ensure they match before they
are sent to the coordinator. This ensures we're not sending documents
across the network that the coordinator immediately discards, saving
bandwidth and coordinator processing. This PR further executes field
projection (`fields` in the query) at the shard level. This should
further save bandwidth, particularly for queries that project few fields
from large documents.
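
For a concrete example, here is the kind of query that benefits most. The database and field names are invented for illustration; the request shape is the standard Mango `_find` body, shown below alongside the EJSON tuple form Mango uses internally.

```erlang
%% Hypothetical request (names invented for illustration):
%%
%%   POST /db/_find
%%   {"selector": {"type": "user"},
%%    "fields": ["_id", "name"]}
%%
%% Previously the shard streamed each matching document whole and the
%% coordinator discarded everything but _id and name; with this PR only
%% the two projected fields leave the shard. In the Erlang shell, the
%% same query in Mango's internal EJSON tuple form:
Selector = {[{<<"type">>, <<"user">>}]}.
Fields = [<<"_id">>, <<"name">>].
```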

One item of complexity is that a query may request a quorum read of
documents, meaning that we need to do the document read at the
coordinator rather than the shard, then perform the `selector` and
`fields` processing there rather than at the shard. To ensure that
documents are processed consistently, whether at the shard or the
coordinator, match_and_extract_doc/3 is added. There is still one orphan
call to extract/2 outside match_and_extract_doc/3; it supports cluster
upgrades and should later be removed.
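
As a sketch of that helper's shape (the clause structure and return values here are my assumptions, not the exact code from the PR; mango_selector:match/2 and mango_fields:extract/2 are the existing Mango functions the text refers to):

```erlang
%% Sketch: run the selector match and, only on a match, the fields
%% projection, so shard and coordinator share one code path.
match_and_extract_doc(Doc, Selector, Fields) ->
    case mango_selector:match(Selector, Doc) of
        true ->
            % Only matching documents pay for the projection.
            FinalDoc = mango_fields:extract(Doc, Fields),
            {match, FinalDoc};
        false ->
            {no_match, Doc}
    end.
```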

Shard-level processing is already performed in a callback, view_cb/2,
that's passed to fabric's view processing to run for each row in the
view result set. It's used for the shard-local selector and fields
processing. To make it clear which arguments are destined for this
callback, the PR encapsulates them using viewcbargs_new/2 and
viewcbargs_get/2.

As we push down more functionality to the shard, the context this
function needs to carry with it will increase, so having a record for it
will be valuable.
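
A minimal sketch of that encapsulation, assuming a map as the container (the PR may equally use a record):

```erlang
%% Sketch: bundle everything view_cb/2 needs into one value, so future
%% pushdowns only add a key here instead of another positional argument.
viewcbargs_new(Selector, Fields) ->
    #{selector => Selector, fields => Fields}.

viewcbargs_get(selector, #{selector := Selector}) -> Selector;
viewcbargs_get(fields, #{fields := Fields}) -> Fields.
```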

Supporting cluster upgrades:

The PR supports shard pushdown of Mango `fields` processing during
rolling cluster upgrades. (Cloudant requires this, as it uses rolling
upgrades.)

In the state where a non-upgraded coordinator is speaking to an upgraded
node, view_cb/2 needs to support being passed just the `selector`,
outside of the new viewcbargs record. In this case, the shard will not
process `fields`, but the coordinator will.
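
A hedged sketch of that fallback, assuming the shard receives its options as a proplist (couch_util:get_value/2 is the existing helper; the option keys mirror the names used above):

```erlang
%% Sketch: accept both wire formats during a rolling upgrade.
view_cb_args(Options) ->
    case couch_util:get_value(callback_args, Options) of
        undefined ->
            % Old-style coordinator sent only a selector: match at the
            % shard, leave the fields projection to the coordinator.
            Selector = couch_util:get_value(selector, Options),
            viewcbargs_new(Selector, undefined);
        Args ->
            % New-style coordinator sent the full viewcbargs value.
            Args
    end.
```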

In the situation where the coordinator is upgraded but the shard is not,
we need to send the selector to the shard via `selector` and also
execute the fields projection at the coordinator. Therefore we pass
arguments to view_cb/2 via both `selector` and `callback_args`, and have
an apparently spurious field projection (mango_fields:extract/2) in the
code that receives values back from the shard (factored out into
doc_member_and_extract).
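
This double projection is safe because extracting the same field list twice is idempotent: an old shard returns the whole document and the coordinator trims it, while a new shard returns an already-projected document that the extraction leaves unchanged. A sketch, with the function's signature assumed:

```erlang
%% Sketch: the coordinator applies the projection unconditionally to
%% whatever the shard sent back.
doc_member_and_extract(Doc, Fields) when is_list(Fields) ->
    {ok, mango_fields:extract(Doc, Fields)};
doc_member_and_extract(Doc, undefined) ->
    % No fields requested: pass the document through whole.
    {ok, Doc}.
```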

Both of these affordances should only need to exist through one minor
version change and be removed thereafter -- if people are jumping
several minor versions of CouchDB in one go, hopefully they are prepared
for a bit of trouble.

Testing upgrade states:

As view_cb is completely separate from the rest of the cursor code, we
can first try out the branch's code using view_cb from `main`, and then
the other way around -- the branch's view_cb with the rest of the file
from `main`. I did both of these tests successfully.

Testing recommendations

This PR should not change anything from an end-user perspective. Mango responses should remain the same as they are today.

I have run some basic performance tests locally using k6.io, which showed no meaningful change in request latency.

Related Issues or Pull Requests

none.

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • Documentation changes were made in the src/docs folder
  • Documentation changes were backported (separated PR) to affected branches

@nickva (Contributor) left a comment


+1

A very nice optimization!

(See a minor grammar nit and a question about a stronger match)

@mikerhodes force-pushed the mango-fields-pushdown branch 2 times, most recently from ac02d44 to 6b73584, on January 20, 2023 at 09:56.
I needed to understand the format of arguments to `match/2` when writing
the code to support projecting fields on the shard, so I wrote some code
to figure it out as a test. I figure this may be useful for future work
in this area, so I've pushed it as a commit.
```erlang
% This supports receiving our "arguments" either as just the `selector`
% or in the new record in `callback_args`. This is to support mid-upgrade
% clusters where the non-upgraded coordinator nodes will send the older style.
% TODO remove this in a couple of couchdb versions.
```

Member

Might be worth codifying this a little. I know we have other places, but maybe we can start here:

Maybe we can make a comment like `% x-couch-remove: 4.0.0` that we can then grep for prior to releasing 4.0.0.

The reasoning here is that we can say: upgrade everything to the latest 3.x version and THEN going to 4.0.0 will be smooth, rather than supporting any 3.x -> 4.0.0 path.

mikerhodes (Contributor, Author)

This seems reasonable. My question is: does this removal need to wait for 4.x, given it's not a breaking change, or does it need to wait a couple of minor (or patch) releases?

I don't know how many people do rolling upgrades outside of Cloudant, as opposed to just stopping the world for a few minutes.

Member

I think enough folks do this, and not necessarily within a small round of point releases, that it warrants keeping this until 4.0.0.

Member

But I'm fine with making this change in a follow-up PR, where we go through all our rolling-upgrade compat code, and merging this as-is. #ScopeCreep

mikerhodes (Contributor, Author)

Okay. Sounds good.

@nickva (Contributor), Jan 20, 2023

We've been informally tagging those with "backwards compatibility" or "upgrade clause" notes in the comments. Searching for "TODO" would work as well. Some official marker would be better, of course.

Another way is to add a config parameter, like we had for rexi's `use_kill_all`, but there is a balance to strike there.

But I am inclined to merge it as-is and do the extra formalizing as another task.
