fetcher: don't hold batches across boundaries #66376
Conversation
Force-pushed from fb753d6 to ae5b218.
This PR was failing tests that uncovered an inaccuracy in the way that interleaved "seek prefixes" were previously computed. Added a first commit that fixes that problem.
Reviewed 2 of 2 files at r1, 3 of 5 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)
pkg/sql/row/fetcher.go, line 719 at r2 (raw file):
gcassert:inline
Interesting. Do we run gcassert as part of our build?
pkg/sql/row/kv_batch_fetcher.go, line 544 at r2 (raw file):
if len(f.remainingBatches) > 0 {
    batchResp, f.remainingBatches = popBatch(f.remainingBatches)
    atBatchBoundary = len(f.remainingBatches) == 0 && !f.fetchEnd
len(f.remainingBatches) == 0 doesn't seem to be a sufficient condition for this fetcher to have to issue a new BatchRequest -- there may still be responses left in f.responses.
The code in this file could use some more comments. It is doing something quite straightforward, but it took me a while to figure out that the results from a BatchResponse are being extracted in two levels -- first in each response, and then the BatchResponses in each response.
And the origSpan is actually staying the same for multiple []byte that share the same BatchResponses. Hopefully the higher-layer code doesn't interpret the span as something that is fully satisfied by a single batch it has been given. Speaking of that, I don't see any write to f.origSpan -- where is it updated? Do we have a bug here when a scan uses BATCH_RESPONSE instead of KEY_VALUES?
pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):
    batchResp, f.remainingBatches = popBatch(t.BatchResponses)
}
atBatchBoundary = len(f.responses) == 0 && len(f.remainingBatches) == 0 && !f.fetchEnd
How about a separate function with this condition -- I am assuming it will get inlined.
pkg/sql/row/kv_fetcher.go, line 105 at r2 (raw file):
f.newSpan = false
nKvs := len(f.kvs)
if nKvs != 0 {
Can you add a comment that f.kvs is not used when f.batchResponse is non-nil, and vice versa?
Nice find! I have some nits, but I'll defer to Sumeer for approval.
Reviewed 2 of 2 files at r1, 5 of 5 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @sumeerbhola)
pkg/sql/row/fetcher.go, line 55 at r2 (raw file):
// version - both must be handled by calling code.
nextBatch(ctx context.Context) (ok bool, kvs []roachpb.KeyValue, batchResponse []byte, origSpan roachpb.Span, moreKeys bool, err error)
nit: s/moreKeys/atBatchBoundary/, right?
pkg/sql/row/fetcher.go, line 719 at r2 (raw file):
Previously, sumeerbhola wrote…
gcassert:inline
Interesting. Do we run gcassert as part of our build?
It is the TestGCAssert linter.
Jordan, you'll need to include the sql/row package to run GCAssert against. Or maybe we should switch to opt-in by default for the linter? Will it be too slow?
pkg/sql/row/fetcher.go, line 722 at r2 (raw file):
//gcassert:inline
func (rf *Fetcher) setNextKV(kv roachpb.KeyValue, needsCopy bool) {
    if needsCopy {
super nit: I like having symmetry if possible; maybe refactor one of the setNextKV methods to have the structure of the other (e.g. having an if !needsCopy check here and return early).
pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):
Previously, sumeerbhola wrote…
how about a separate function with this condition -- I am assuming it will get inlined.
+1
Previously, there was a bug in the decoding methods that returned the expected "seek prefix" for an interleaved table when encountering a row from an unwanted table. The returned seek prefix was not as precise as it should have been: instead of returning the first key of the desired table, it returned a key somewhere in the middle of the found table. This did not cause any user-visible issues, since the cfetcher code is resilient to finding an incorrect interleaved table row. But it caused interleaved table fetches to be less efficient than necessary.

This commit also adds explicit capacities to the returned keys and values from the KV Fetcher, to prevent issues where callers accidentally slice past the length of a large-capacity backing slice that backs the returned kvs.

Release note: None
Force-pushed from ec081dc to 138d77e.
Alright, PTAL. Sorry for the long delay, thanks for the reviews. I think I was able to simplify this by a decent amount.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)
pkg/sql/row/fetcher.go, line 719 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It is the TestGCAssert linter. Jordan, you'll need to include the sql/row package to run GCAssert against. Or maybe we should switch to opt-in by default for the linter? Will it be too slow?
Done.
Yeah I think it would be too slow to turn on for all packages.
pkg/sql/row/fetcher.go, line 722 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit: I like having symmetry if possible; maybe refactor one of the setNextKV methods to have the structure of the other (e.g. having an if !needsCopy check here and return early).
Done.
pkg/sql/row/kv_batch_fetcher.go, line 544 at r2 (raw file):
Previously, sumeerbhola wrote…
len(f.remainingBatches) == 0 doesn't seem to be a sufficient condition for this fetcher to have to issue a new BatchRequest -- there may still be responses left in f.responses.
The code in this file could use some more comments. It is doing something quite straightforward, but it took me a while to figure out that the results from a BatchResponse are being extracted in two levels -- first in each response, and then the BatchResponses in each response.
And the origSpan is actually staying the same for multiple []byte that share the same BatchResponses. Hopefully the higher-layer code doesn't interpret the span as something that is fully satisfied by a single batch it has been given. Speaking of that, I don't see any write to f.origSpan -- where is it updated? Do we have a bug here when a scan uses BATCH_RESPONSE instead of KEY_VALUES?
Thinking this through some more, I actually always want this function to return true, so I've removed the parameter. The purpose of this exercise was to allow higher-level code to know, given a returned roachpb.KeyValue, whether it was the final reference to a larger backing byte slice - indicating that the next call to NextKV would potentially ask for another larger backing byte slice before the caller had a chance to clear its references to the previous one.
This is always true for every byte slice batch: for every batch, the only remaining pointer to that batch once it's finished is a returned KV. So we can handle this logic completely in the layer that slices up BatchResponses into KVs one by one (the kv_fetcher.go code).
I added some comments about what is actually going on in this loop. I agree it's confusing; the recursion doesn't help, although I'm not sure of the best way to structure the loop without the recursive call at the end.
You're right about f.origSpan: we don't seem to update it properly. It actually turns out that nobody uses this return value - I deleted it altogether because, to your point, it's confusing and broken.
pkg/sql/row/kv_batch_fetcher.go, line 559 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
+1
I was able to delete this altogether.
pkg/sql/row/kv_fetcher.go, line 105 at r2 (raw file):
Previously, sumeerbhola wrote…
Can you add a comment that f.kvs is not used when f.batchResponse is non-nil and vice versa.
Done.
Previously, the cfetcher was completely blind to whether or not the keys it retrieves from the kvfetcher lie on a batch boundary. This led to the cfetcher retaining a pointer to 2 fetched batches at once during fetches, since it doesn't deep copy key or value memory out of the batches. The retained pointer is required, because the cfetcher needs to keep the bytes of the last key and value it saw for various reasons.

Now, the kvfetcher and kvbatchfetcher participate by returning whether or not the most recent returned key or batch lies on a *batch boundary*, meaning that the next invocation to the fetcher would require sending a BatchRequest to KV, getting more bytes back. If this condition is true, the cFetcher will now deep copy its last-seen KV instead of keeping a pointer into the most recent batch.

This improves memory usage by 2x for a brief moment, since before, the cFetcher would always have a live pointer to both the last batch and the next batch until it had a chance to decode the first key of the next batch. Commit 5c00812 improved this same situation by drastically cutting down the window in which the last batch was retained. But it wasn't quite enough. After this commit, there should be no moment in time when the fetcher ever retains a pointer to more than one batch at the same time.

Release note (sql change): reduce instantaneous memory usage during scans by up to 2x.
This commit reduces the number of interface calls in the hot path for fetching data by 1 per column per row.

Release note: None
Force-pushed from 138d77e to e42fe76.
Reviewed 1 of 5 files at r3, 3 of 6 files at r6, 3 of 3 files at r8, 1 of 1 files at r9.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)
TFTR! bors r+
(this is just a test, not really backporting) blathers backport 21.1
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 99a96f1 to blathers/backport-release-21.1-66376: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []; you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 21.1 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Build succeeded.
Updates #64906.