Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upsql: use ClearRange for index truncation #31326
Conversation
eriktrinh
requested a review
from
vivekmenezes
Oct 12, 2018
eriktrinh
requested review from
cockroachdb/core-prs
as
code owners
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
vivekmenezes
requested changes
Oct 12, 2018
Nicely done!
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/lease_test.go, line 1025 at r1 (raw file):
t.Fatal(err) }
I think it's valuable to count the size of the index and assert that the above INSERT added a value to the index even though it was dropped.
pkg/sql/tablewriter_delete.go, line 261 at r1 (raw file):
autoCommit autoCommitOpt, traceKV bool, allowClearRange bool,
We should create a new function clearIndex() and use that instead of passing in a bool. In clearRange() you can assert that !idx.IsInterleaved
eriktrinh
reviewed
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/lease_test.go, line 1025 at r1 (raw file):
Previously, vivekmenezes wrote…
I think it's valuable to count the size of the index and assert that the above INSERT added a value to the index even though it was dropped.
Done.
pkg/sql/tablewriter_delete.go, line 261 at r1 (raw file):
Previously, vivekmenezes wrote…
We should create a new function clearIndex() and use that instead of passing in a bool. In clearRange() you can assert that !idx.IsInterleaved
Done.
vivekmenezes
requested changes
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/tablewriter_delete.go, line 306 at r2 (raw file):
) (roachpb.Span, error) { if idx.IsInterleaved() { if log.V(2) {
you can return an error here. There should be no reason why IsInterleaved() is true
pkg/storage/batcheval/cmd_clear_range.go, line 95 at r2 (raw file):
if err := stateLoader.SetGCThreshold(ctx, batch, cArgs.Stats, &gcThreshold); err != nil { return result.Result{}, err }
We should have a test in storage that tests this. I'm adding @benesch as a reviewer
vivekmenezes
requested a review
from
benesch
Oct 15, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
looks like this has a broken test in Import |
eriktrinh
reviewed
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/tablewriter_delete.go, line 306 at r2 (raw file):
Previously, vivekmenezes wrote…
you can return an error here. There should be no reason why IsInterleaved() is true
It can be true because truncateIndexes (which calls clearIndex) is still called in the synchronous code path for interleaved indexes. In fact it will also call this for non-interleaved indexes in the synchronous code path if the cluster version does not allow for ClearRange (on a related note, do we need to add a new cluster version flag for truncating indexes?).
We should check canClearRangeForDrop once again in truncateIndexes and call this if it can, and deleteIndex otherwise.
vivekmenezes
requested changes
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/tablewriter_delete.go, line 306 at r2 (raw file):
Previously, eriktrinh (Erik Trinh) wrote…
It can be true because
truncateIndexes(which callsclearIndex) is still called in the synchronous code path for interleaved indexes. In fact it will also call this for non-interleaved indexes in the synchronous code path if the cluster version does not allow for ClearRange (on a related note, do we need to add a new cluster version flag for truncating indexes?).We should check
canClearRangeForDroponce again intruncateIndexesand call this if it can, anddeleteIndexotherwise.
Let's maintain two separate code paths for the deleteIndex and clearIndex and return an error from clearIndex if it shouldn't have been called. Thanks!
eriktrinh
reviewed
Oct 15, 2018
looks like this has a broken test in Import
looks like there is a race, still taking a look
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/tablewriter_delete.go, line 306 at r2 (raw file):
Previously, vivekmenezes wrote…
Let's maintain two separate code paths for the deleteIndex and clearIndex and return an error from clearIndex if it shouldn't have been called. Thanks!
Done.
benesch
reviewed
Oct 15, 2018
Reviewed 2 of 3 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained
pkg/storage/batcheval/cmd_clear_range.go, line 95 at r2 (raw file):
Previously, vivekmenezes wrote…
We should have a test in storage that tests this. I'm adding @benesch as a reviewer
I think the implications of this are problematic. Say the index is located on the same range as the primary key, which is the case for any table whose primary + index data is less than 64MB. Dropping that index would forward the GC threshold on that entire range to now, meaning that AOST queries on the primary key data will be impossible!
In fact, even if you assume that the index is on a separate range, I'm not sure this is sufficient? Don't AOST queries see the old version of the descriptor? So they'll try to use the index and explode because the index's range won't serve reads at that time.
Anyway, @tschottdorf is the expert here. I might be off base.
tschottdorf
reviewed
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/storage/batcheval/cmd_clear_range.go, line 95 at r2 (raw file):
Previously, benesch (Nikhil Benesch) wrote…
I think the implications of this are problematic. Say the index is located on the same range as the primary key, which is the case for any table whose primary + index data is less than 64MB. Dropping that index would forward the GC threshold on that entire range to now, meaning that AOST queries on the primary key data will be impossible!
In fact, even if you assume that the index is on a separate range, I'm not sure this is sufficient? Don't AOST queries see the old version of the descriptor? So they'll try to use the index and explode because the index's range won't serve reads at that time.
Anyway, @tschottdorf is the expert here. I might be off base.
Yeah, we wouldn't want to wipe the AOST window here. I'm probably lacking context here, but why do we need to do this? I was assuming the schema change machinery would delete this data only once it's not being accessed any more.
benesch
reviewed
Oct 15, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/storage/batcheval/cmd_clear_range.go, line 95 at r2 (raw file):
I think you actually answered that question a year ago. :)
One complication is that all index operations are serialized through the surrounding table descriptor, so that the naive approach of waiting for the TTL is not great (as it holds up unrelated schema changes).
eriktrinh
reviewed
Oct 17, 2018
PTAL, updated the pr to move ahead with this despite the ClearRange limitation. The failing test from before was probably the race between the split queue and forwarding the GC threshold.
Reviewable status:
complete! 0 of 0 LGTMs obtained
eriktrinh commentedOct 12, 2018
•
edited
This change makes the deletion of index data use the ClearRange batch
request. DROP INDEX schema changes in the same transaction as the one
which created the table still uses the slower DelRange because
ClearRange cannot be run inside a transaction and will remove write
intents of the index keys which have not been resolved in the
transaction.
This deletion of index data happens once the GC TTL period has passed
and it is safe to remove all the data. See PR #30566 for which delays
this data deletion.
The GC threshold is also forwarded for the cleared range to prevent
reads and writes of index data.
Fixes #20696.
Release note (performance improvement): Deletion of index data is faster.