
fetcher: don't hold batches across boundaries #66376

Merged
merged 3 commits into cockroachdb:master from cfetcher-boundary-copy on Jul 9, 2021

Conversation

jordanlewis
Member

Updates #64906.

Previously, the cFetcher was completely blind to whether or not the keys
it retrieves from the kvfetcher lie on a batch boundary. This led to
the cFetcher retaining pointers into two fetched batches at once during
fetches, since it doesn't deep copy key or value memory out of the
batches. Retaining a pointer is required because the cFetcher needs to
keep the bytes of the last key and value it saw for various reasons.

Now, the kvfetcher and kvbatchfetcher participate by returning whether
or not the most recently returned key or batch lies on a batch boundary,
meaning that the next invocation of the fetcher would require sending a
BatchRequest to KV to get more bytes back.

If this condition is true, the cFetcher will now deep copy its last-seen
KV instead of keeping a pointer into the most recent batch.
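
In code terms, the copy-at-boundary idea looks roughly like this (a minimal, self-contained sketch with hypothetical names, not the actual cFetcher code):

	// rememberLastKV keeps the last-seen key alive across fetch calls.
	// Only when the current batch is exhausted do we deep copy, so that
	// no pointer into the old batch survives into the next fetch.
	func rememberLastKV(lastKey []byte, atBatchBoundary bool) []byte {
		if atBatchBoundary {
			// The next call may fetch a fresh batch; copy the bytes out
			// so the old batch becomes garbage-collectable.
			return append([]byte(nil), lastKey...)
		}
		// Safe to keep pointing into the current batch.
		return lastKey
	}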

This improves peak memory usage by up to 2x during a brief window, since
before, the cFetcher would always hold live pointers to both the last
batch and the next batch until it had decoded the first key of the next
batch.

Commit 5c00812 improved this same situation by drastically cutting down
the window in which the last batch was retained, but it wasn't quite
enough. After this commit, there should be no moment at which the
fetcher retains pointers into more than one batch at the same time.

Release note (sql change): reduce instantaneous memory usage during
scans by up to 2x.

@cockroach-teamcity (Member)

This change is Reviewable

@jordanlewis (Member, Author)

This PR was failing tests that uncovered an inaccuracy in the way that interleaved "seek prefixes" were previously computed. Added a first commit that fixes that problem.

@sumeerbhola (Collaborator) left a comment

Reviewed 2 of 2 files at r1, 3 of 5 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/row/fetcher.go, line 719 at r2 (raw file):

gcassert:inline

Interesting. Do we run gcassert as part of our build?
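
For context: gcassert (github.com/jordanlewis/gcassert) parses the Go compiler's optimization diagnostics and fails if a directive comment is not honored. A sketch of the directive in use (hypothetical type and method, for illustration only):

	type fooFetcher struct {
		done bool
	}

	// gcassert verifies, via the compiler's -m output, that calls to
	// this function are actually inlined.
	//gcassert:inline
	func (f *fooFetcher) exhausted() bool {
		return f.done
	}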


pkg/sql/row/kv_batch_fetcher.go, line 544 at r2 (raw file):

	if len(f.remainingBatches) > 0 {
		batchResp, f.remainingBatches = popBatch(f.remainingBatches)
		atBatchBoundary = len(f.remainingBatches) == 0 && !f.fetchEnd

len(f.remainingBatches) == 0 doesn't seem to be a sufficient condition for this fetcher to have to issue a new BatchRequest -- there may still be responses left in f.responses.

The code in this file could use some more comments. It is doing something quite straightforward but it took me a while to figure out that the results from a BatchResponse are being extracted in two levels -- first in each response and then the BatchResponses in each response.
And the origSpan is actually staying the same for multiple []byte that share the same BatchResponses. Hopefully the higher-layer code doesn't interpret the span as something that is fully satisfied by a single batch it has been given. Speaking of that, I don't see any write to f.origSpan -- where is it updated? Do we have a bug here when a scan uses BATCH_RESPONSE instead of KEY_VALUES?
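
To illustrate the two-level extraction being described, a toy sketch (the types here are stand-ins, not the actual roachpb definitions):

	// response is a stand-in for one response inside a BatchResponse;
	// each one can carry several raw []byte batches.
	type response struct {
		BatchResponses [][]byte
	}

	// nextBytes pops the next raw batch: first from the leftover batches
	// of the current response, then from the next queued response.
	func nextBytes(responses []response, remaining [][]byte) (batch []byte, rest []response, left [][]byte) {
		if len(remaining) > 0 {
			return remaining[0], responses, remaining[1:]
		}
		if len(responses) > 0 && len(responses[0].BatchResponses) > 0 {
			r := responses[0]
			return r.BatchResponses[0], responses[1:], r.BatchResponses[1:]
		}
		return nil, nil, nil
	}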


pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):

				batchResp, f.remainingBatches = popBatch(t.BatchResponses)
			}
			atBatchBoundary = len(f.responses) == 0 && len(f.remainingBatches) == 0 && !f.fetchEnd

How about a separate function with this condition? I am assuming it will get inlined.
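
A sketch of such a helper (the receiver and field names are inferred from the quoted condition, not confirmed):

	// gcassert:inline
	func (f *txnKVFetcher) atBatchBoundary() bool {
		return len(f.responses) == 0 && len(f.remainingBatches) == 0 && !f.fetchEnd
	}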


pkg/sql/row/kv_fetcher.go, line 105 at r2 (raw file):

		f.newSpan = false
		nKvs := len(f.kvs)
		if nKvs != 0 {

Can you add a comment that f.kvs is not used when f.batchResponse is non-nil and vice versa.
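
The requested invariant as a doc-comment sketch (the struct shape is assumed; the field names come from the comment, and roachpb is github.com/cockroachdb/cockroach/pkg/roachpb):

	// Sketch of the fetcher state. Invariant: kvs and batchResponse are
	// mutually exclusive -- kvs is unused while batchResponse is
	// non-nil, and vice versa.
	type kvFetcherState struct {
		kvs           []roachpb.KeyValue
		batchResponse []byte
	}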

@yuzefovich (Member) left a comment

Nice find! I have some nits, but I'll defer to Sumeer for approval.

Reviewed 2 of 2 files at r1, 5 of 5 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @sumeerbhola)


pkg/sql/row/fetcher.go, line 55 at r2 (raw file):

	// version - both must be handled by calling code.
	nextBatch(ctx context.Context) (ok bool, kvs []roachpb.KeyValue,
		batchResponse []byte, origSpan roachpb.Span, moreKeys bool, err error)

nit: s/moreKeys/atBatchBoundary/, right?
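
i.e., with the rename applied, the signature would read:

	nextBatch(ctx context.Context) (ok bool, kvs []roachpb.KeyValue,
		batchResponse []byte, origSpan roachpb.Span, atBatchBoundary bool, err error)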


pkg/sql/row/fetcher.go, line 719 at r2 (raw file):

Previously, sumeerbhola wrote…
gcassert:inline

Interesting. Do we run gcassert as part of our build?

It is the TestGCAssert linter.

Jordan, you'll need to include the sql/row package in the set that GCAssert runs against. Or maybe we should switch the linter to run on all packages by default? Will it be too slow?


pkg/sql/row/fetcher.go, line 722 at r2 (raw file):

// gcassert:inline
func (rf *Fetcher) setNextKV(kv roachpb.KeyValue, needsCopy bool) {
	if needsCopy {

super nit: I like having symmetry if possible -- maybe refactor one of the setNextKV methods to have the structure of the other (e.g. having an if !needsCopy check here and returning early).
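
The suggested shape, sketched (the copy branch is not shown in the quoted snippet, so the field names and copying details here are assumptions):

	// gcassert:inline
	func (rf *Fetcher) setNextKV(kv roachpb.KeyValue, needsCopy bool) {
		if !needsCopy {
			rf.kv = kv
			return
		}
		// Deep copy the key and value bytes so no pointer into the
		// batch's backing memory survives past the batch boundary.
		rf.kv = kv
		rf.kv.Key = append([]byte(nil), kv.Key...)
		rf.kv.Value.RawBytes = append([]byte(nil), kv.Value.RawBytes...)
	}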


pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):

Previously, sumeerbhola wrote…

how about a separate function with this condition -- I am assuming it will get inlined.

+1

Previously, there was a bug in the decoding methods that return the
expected "seek prefix" for an interleaved table when encountering a row
from an unwanted table. The returned seek prefix was not as precise as
it should have been: instead of returning the first key of the desired
table, it returned a key somewhere in the middle of the found table.

This did not cause any user-visible issues, since the cFetcher code is
resilient to finding an incorrect interleaved table row. But it caused
interleaved table fetches to be less efficient than necessary.

This commit also adds explicit capacities to the keys and values
returned from the KV fetcher, to prevent issues where callers
accidentally slice past the length of a large-capacity backing slice
that backs the returned KVs.

Release note: None
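
Setting an explicit capacity uses Go's full slice expression; a quick illustration (toy values, not the fetcher code):

	raw := make([]byte, 100) // large backing slice holding many KVs
	key := raw[10:20:20]     // len 10, cap 10: capacity pinned at the length
	key = append(key, 'x')   // reallocates rather than writing into raw[20]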
@jordanlewis force-pushed the cfetcher-boundary-copy branch 2 times, most recently from ec081dc to 138d77e on July 2, 2021 20:50
@jordanlewis (Member, Author) left a comment

Alright, PTAL. Sorry for the long delay, and thanks for the reviews. I think I was able to simplify this by a decent amount.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)


pkg/sql/row/fetcher.go, line 719 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

It is the TestGCAssert linter.

Jordan, you'll need to include the sql/row package in the set that GCAssert runs against. Or maybe we should switch the linter to run on all packages by default? Will it be too slow?

Done.

Yeah I think it would be too slow to turn on for all packages.


pkg/sql/row/fetcher.go, line 722 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: I like having symmetry if possible -- maybe refactor one of the setNextKV methods to have the structure of the other (e.g. having an if !needsCopy check here and returning early).

Done.


pkg/sql/row/kv_batch_fetcher.go, line 544 at r2 (raw file):

Previously, sumeerbhola wrote…

len(f.remainingBatches) == 0 doesn't seem to be a sufficient condition for this fetcher to have to issue a new BatchRequest -- there may still be responses left in f.responses.

The code in this file could use some more comments. It is doing something quite straightforward but it took me a while to figure out that the results from a BatchResponse are being extracted in two levels -- first in each response and then the BatchResponses in each response.
And the origSpan is actually staying the same for multiple []byte that share the same BatchResponses. Hopefully the higher-layer code doesn't interpret the span as something that is fully satisfied by a single batch it has been given. Speaking of that, I don't see any write to f.origSpan -- where is it updated? Do we have a bug here when a scan uses BATCH_RESPONSE instead of KEY_VALUES?

Thinking this through some more, I actually always want this function to return true, so I've removed the parameter. The purpose of this exercise was to let higher-level code know, given a returned roachpb.KeyValue, whether it is the final reference to a larger backing byte slice -- indicating that the next call to NextKV could fetch another large backing byte slice before the caller had a chance to clear its references to the previous one.

This is always true for every byte slice batch: once a batch is finished, the only remaining pointer into it is the returned KV. So we can handle this logic completely in the layer that slices up BatchResponses into KVs one by one (the kv_fetcher.go code).
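
Concretely, the slicing layer can report this itself; a sketch with hypothetical names:

	// decodeOne is a stand-in for the real KV decoding; here it just
	// splits off a fixed-width prefix.
	func decodeOne(batch []byte) (kv, rest []byte) {
		const kvLen = 16
		if len(batch) <= kvLen {
			return batch, nil
		}
		return batch[:kvLen:kvLen], batch[kvLen:]
	}

	// nextKV returns one KV and reports whether it is the final
	// reference into the batch's backing memory -- the signal the
	// caller uses to decide when to deep copy.
	func nextKV(batch []byte) (kv, rest []byte, lastInBatch bool) {
		kv, rest = decodeOne(batch)
		return kv, rest, len(rest) == 0
	}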

I added some comments about what is actually going on in this loop. I agree it's confusing; the recursion doesn't help, although I'm not sure of a better way to structure the loop without the recursive call at the end.

You're right about f.origSpan: we don't seem to update it properly. It actually turns out that nobody uses this return value, so I deleted it altogether -- to your point, it was confusing and broken.


pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

+1

I was able to delete this altogether.


pkg/sql/row/kv_fetcher.go, line 105 at r2 (raw file):

Previously, sumeerbhola wrote…

Can you add a comment that f.kvs is not used when f.batchResponse is non-nil and vice versa.

Done.

This commit reduces the number of interface calls in the hot path for
fetching data by one per column per row.

Release note: None
@sumeerbhola (Collaborator) left a comment

:lgtm:

Reviewed 1 of 5 files at r3, 3 of 6 files at r6, 3 of 3 files at r8, 1 of 1 files at r9.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)

@jordanlewis (Member, Author)

TFTR!

bors r+

@jordanlewis (Member, Author)

(this is just a test, not really backporting)

blathers backport 21.1

blathers-crl bot commented Jul 9, 2021

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 99a96f1 to blathers/backport-release-21.1-66376: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

You may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.


craig bot commented Jul 9, 2021

Build succeeded:

@craig craig bot merged commit 8f65a25 into cockroachdb:master Jul 9, 2021
@jordanlewis jordanlewis deleted the cfetcher-boundary-copy branch July 9, 2021 20:36