-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rowexec: joinReader optimizations for index joins #52952
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @helenmhe, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
// - For indexJoinReaderType this allows lower layers to optimize iteration // over the data. It's safe to sort since the looked up rows are output // unchanged.
this sort for index joins is causing test failures -- there must be a gap in my understanding on why we can't sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @helenmhe, @sumeerbhola, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Previously, sumeerbhola wrote…
this sort for index joins is causing test failures -- there must be a gap in my understanding on why we can't sort.
When I refactored index joins to use the joinreader there wasn't any logic in the old indexjoiner.go
for sorting spans, so I didn't add any new logic in and based the index join strategy off of joinReaderNoOrderingStrategy
. Sorting isn't safe in the index join case because the order of the output is changed and then not restored during output collection. We might need to do something similar to joinReaderOrderingStrategy
to make sorting the spans correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @helenmhe, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Sorting isn't safe in the index join case because the order of the output is changed and then not restored during output collection.
Do you know why restoring is important given that we simply emit the looked up row? I was imagining we had one of two cases:
- the input would already be fully sorted by the key, so this sort of the spans in a batch would be a noop.
- the input was not sorted, so the next processor would not be expecting any ordering invariant, so sorting within a batch would be harmless.
What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @sumeerbhola, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Previously, sumeerbhola wrote…
Sorting isn't safe in the index join case because the order of the output is changed and then not restored during output collection.
Do you know why restoring is important given that we simply emit the looked up row? I was imagining we had one of two cases:
- the input would already be fully sorted by the key, so this sort of the spans in a batch would be a noop.
- the input was not sorted, so the next processor would not be expecting any ordering invariant, so sorting within a batch would be harmless.
What am I missing?
For me the tests that were failing were like TestSchemaChangeAfterCreateInTxn
where the test checks for a certain order but the query doesn't actually have an ORDER BY - when I asked it was suggested this was more of an issue with the test, it's possible sorting within the batch is still harmless I guess?
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file): Previously, helenmhe (Helen He) wrote…
Hm never mind I took a look at the CI and saw the failing tests with ORDER BYs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @sumeerbhola, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Previously, helenmhe (Helen He) wrote…
Hm never mind I took a look at the CI and saw the failing tests with ORDER BYs
What's the MaintainOrdering
parameter on the spec for these test failures? It's possible that the lookup rows are sorted by secondary index and this is the order the tests expect to be maintained. This call to sort.Sort
will probably reorder spans by primary index, which might be a different order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @helenmhe, and @yuzefovich)
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
What's the
MaintainOrdering
parameter on the spec for these test failures? It's possible that the lookup rows are sorted by secondary index and this is the order the tests expect to be maintained. This call tosort.Sort
will probably reorder spans by primary index, which might be a different order.
Yes, it looks like the failing ones are trying to maintain secondary index order.
I am trying to confirm that the cases I care about are not producing plans that MaintainOrdering
, but the info is not printed in the EXPLAIN
output, so I added some code here
Line 220 in c362707
v.observer.attr(name, "key columns", strings.Join(cols, ", ")) |
EXPLAIN
output. Where is the code that I should be changing instead?
48b2983
to
bc79176
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @helenmhe, and @yuzefovich)
pkg/sql/opt/exec/execbuilder/testdata/ddl, line 208 at r2 (raw file):
SELECT message FROM [SHOW KV TRACE FOR SESSION] WITH ORDINALITY WHERE message LIKE 'fetched:%' OR message LIKE 'output row%' ORDER BY message LIKE 'fetched:%' DESC, ordinality ASC
I want to confirm that changing the expected output is ok here. I don't really understand what ordering this is asking -- query T
suggests there is only 1 column, but this mentions ordinality
which doesn't seem to contribute to the output. Does the DESC of message apply to the whole column when it matches fetched:%
? If so, the reordering below seems wrong.
All the logictests are passing and the CI test failures seem unrelated.
pkg/sql/rowexec/joinreader.go, line 460 at r1 (raw file):
Previously, sumeerbhola wrote…
Yes, it looks like the failing ones are trying to maintain secondary index order.
I am trying to confirm that the cases I care about are not producing plans that
MaintainOrdering
, but the info is not printed in theEXPLAIN
output, so I added some code herewith a new "ordering" field name (this was prior to Radu's cleanup from yesterday of the old EXPLAIN code), but I don't see that information in theLine 220 in c362707
v.observer.attr(name, "key columns", strings.Join(cols, ", ")) EXPLAIN
output. Where is the code that I should be changing instead?
I've fixed this -- it needed a change to pass on the maintainOrdering
for the index join case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, 5 of 5 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @helenmhe, @rytaft, and @yuzefovich)
pkg/sql/opt/exec/execbuilder/testdata/ddl, line 208 at r2 (raw file):
Previously, sumeerbhola wrote…
I want to confirm that changing the expected output is ok here. I don't really understand what ordering this is asking --
query T
suggests there is only 1 column, but this mentionsordinality
which doesn't seem to contribute to the output. Does the DESC of message apply to the whole column when it matchesfetched:%
? If so, the reordering below seems wrong.All the logictests are passing and the CI test failures seem unrelated.
cc @rytaft on the expectations
pkg/sql/opt/exec/execbuilder/testdata/ddl, line 208 at r2 (raw file): Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Looks fine to me... I think the order by clause is just ordering on the boolean result of the |
bc79176
to
2b664b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @helenmhe, @rytaft, and @yuzefovich)
pkg/sql/opt/exec/execbuilder/testdata/ddl, line 208 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Looks fine to me... I think the order by clause is just ordering on the boolean result of the
LIKE
expression, ensuring that theoutput row
results come after thefetched
results
I see. Looking at this again, these are the rows from the preceding SELECT * FROM t@b_desc
, so it is explainable by the sort in the index join. If I change it to SELECT * FROM t@b_desc ORDER BY b DESC
the index join doesn't do the sort, as expected.
TFTRs! |
2b664b3
to
1f14d1a
Compare
These small optimizations are for index joins with large numbers of rows, which are common with geospatial queries. - No longer uses the keyToInputRowIndices map which was consuming about 1.5% in cpu profiles - Sorts the spans to use optimizations at the storage layer, like cockroachdb/pebble#860 - Avoids constructing the partial key string since it is not used Release note: None
1f14d1a
to
511d98d
Compare
bors r+ |
Build succeeded: |
These small optimizations are for index joins with large
numbers of rows, which are common with geospatial queries.
was consuming about 1.5% in cpu profiles
like sstable: optimize seeks to use next pebble#860
not used
Release note: None