-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving search by compound indexes. #3915
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution!
For the following table:
CREATE TABLE TEST(A INTEGER, B INTEGER, C INTEGER) AS
SELECT A, B, C FROM (VALUES 1, 2) T1(A) JOIN (VALUES 3, 4) T2(B) JOIN (VALUES 5, 6) T3(C);
Index conditions aren't listed in execution plan any more, so something is wrong with them:
CREATE INDEX TEST_A_IDX ON TEST(A);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_A_IDX: A IN(1, 2) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_A_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
The same problem here:
DROP INDEX TEST_A_IDX;
CREATE INDEX TEST_B_IDX ON TEST(B);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX: B IN(3, 4) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
And here
CREATE INDEX TEST_C_A_IDX ON TEST(C, A);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX: B IN(3, 4) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
Here they are listed with a wrong order of values, it must be (5, 1), (6, 2)):
EXPLAIN SELECT * FROM TEST WHERE (A, C) IN ((1, 5), (2, 6));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_C_A_IDX: A IN(1, 2) AND C IN(5, 6) */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_C_A_IDX: IN(ROW (1, 5), ROW (2, 6)) */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6))
There problems need to be resolved and I think we need more tests for these conditions.
* Contains a {@link Column} or {@code Column[]} depending on the condition type. | ||
* @see #isCompoundColumns() | ||
*/ | ||
private final Object column; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be better to use different fields here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
*/ | ||
private IndexCondition(int compareType, ExpressionList columns, Expression expression) { | ||
this.compareType = compareType; | ||
if (columns == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, use this code style:
if (…) {
…
} else {
…
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
* | ||
* @see Column#convert(CastDataProvider, Value) | ||
*/ | ||
public ValueRow convert(CastDataProvider provider, Column[] columns) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to move this code to some other place. Value classes may not depend on Column
and other database objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a static method in the Column
class instead.
return column; | ||
if (column instanceof Column) | ||
return (Column) column; | ||
throw new IllegalStateException("The getColumn() method cannot be with multiple columns."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something goes horribly wrong, use throw DbException.getInternalError()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -189,6 +209,26 @@ private boolean canUseIndexFor(Column column) { | |||
return idxCol == null || idxCol.column == column; | |||
} | |||
|
|||
private boolean canUseIndexForIn(Column[] columns) { | |||
if ( inColumn != null ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(inColumn != null)
(without spaces inside parentheses).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
It must support them in absence of a better index, because we can't accept a PR with this regression. Actually your implementation still somehow chooses a proper index, but |
Hi @katzyn, Thank you for your feedback. I put back the previous I adjusted some tests to make them pass the new expectations. I need to fix the regular compound comparisons too (e.g: Please let me know if you have any suggestions. Thanks. |
Hi @katzyn, I hope I was able to solve every issue. I have run the tests, and it looks OK to me. Could you please review my pull request again? Thanks. |
ExpressionVisitor visitor = ExpressionVisitor.getNotFromResolverVisitor(filter); | ||
for (Expression e : valueList) { | ||
if (!e.isEverything(visitor)) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!e.isEverything(visitor)) {
return;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ExpressionVisitor visitor = ExpressionVisitor.getNotFromResolverVisitor(filter); | ||
for (Expression e : valueList) { | ||
if (!e.isEverything(visitor)) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
builder.append(" IN("); | ||
for (int i = 0, s = expressionList.size(); i < s; i++) { | ||
if (i > 0) | ||
builder.append(", "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Column[] columns = getColumns(); | ||
for (int i = columns.length; --i >= 0; ) { | ||
if (TableType.TABLE != columns[i].getTable().getTableType()) | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
if (!isCompoundColumns()) { | ||
builder.append("column=").append(column); | ||
} else { | ||
builder.append("columns=").append(columns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Column[].toString()
doesn't return anything useful here, use Column.writeColumns(builder, columns, TRACE_SQL_FLAGS)
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
for (int i = 0; i < cols.length; i++) { | ||
IndexColumn idxCol = cols[i]; | ||
if (idxCol != null && idxCol.column != columns[i]) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
h2/src/main/org/h2/table/Column.java
Outdated
copy[i] = nv; | ||
} | ||
} | ||
return copy == null ? valueRow : ValueRow.get(valueRow.getType(), copy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess valueRow.getType()
can return wrong data type information here, so TypeInfo.getTypeInfo(Value.ROW, 0, 0, new ExtTypeInfoRow(column))
should be used instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Column col = columns[j]; | ||
indexed = col.getColumnId() >= 0 && index.getColumnIndex(col) >= 0; | ||
if (!indexed) | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
if (col.getColumnId() >= 0) { | ||
int columnIndex = index.getColumnIndex(col); | ||
if (columnIndex == 0) // The first column of the index always matches. | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -4424,6 +4424,9 @@ SELECT * FROM TEST2COL WHERE A=0 AND B=0; | |||
EXPLAIN SELECT * FROM TEST2COL WHERE A=0 AND B=0; | |||
>> SELECT "PUBLIC"."TEST2COL"."A", "PUBLIC"."TEST2COL"."B", "PUBLIC"."TEST2COL"."C" FROM "PUBLIC"."TEST2COL" /* PUBLIC.PRIMARY_KEY_E: A = 0 AND B = 0 */ WHERE ("A" = 0) AND ("B" = 0) | |||
|
|||
EXPLAIN SELECT * FROM TEST2COL WHERE (A, B)=(0, 0); | |||
>> SELECT "PUBLIC"."TEST2COL"."A", "PUBLIC"."TEST2COL"."B", "PUBLIC"."TEST2COL"."C" FROM "PUBLIC"."TEST2COL" /* PUBLIC.PRIMARY_KEY_E: A = 0 AND B = 0 */ WHERE ROW ("A", "B") = ROW (0, 0) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, don't add any new tests to this legacy script. You can add them to indexes.sql
instead.
I think we need more tests including multi-column joins with other tables with both EXPLAIN
and actual test of query results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I deleted the new lines.
I created a new test class: TestCompoundIndexSearch
Hi @katzyn, I fixed your previous findings. Could you please check the pull request again? Thanks. |
Something is wrong with execution plan in the following example: CREATE TABLE TEST(A INTEGER, B INTEGER, C INTEGER) AS
SELECT A, B, C FROM (VALUES 1, 2) T1(A) JOIN (VALUES 3, 4) T2(B) JOIN (VALUES 5, 6) T3(C);
CREATE INDEX TEST_C_A_IDX ON TEST(C, A);
EXPLAIN ANALYZE SELECT * FROM TEST WHERE (A, C) IN ((1, 5), (2, 6)); SELECT
"PUBLIC"."TEST"."A",
"PUBLIC"."TEST"."B",
"PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST"
/* PUBLIC.TEST_C_A_IDX: IN(ROW (1, 5), ROW (2, 6))
AND C IN(5, 6)
*/
/* scanCount: 9 */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6)) Correct plan should look like that: /* PUBLIC.TEST_C_A_IDX: IN(ROW (5, 1), ROW (6, 2))
*/ Columns need to be specified in correct order and
|
…sed in the right order.
Hi @katzyn, I fixed the index condition error although, I am not sure this is what you wanted to see. If the query uses the indexed columns in a wrong order, I re-create the index condition with a correct column order. It looks a bit ugly to me, but it seems to work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, send a license statement as described here
https://h2database.com/html/build.html#providing_patches
to our mailing list (Google group)
https://groups.google.com/g/h2-database
(This group is partially pre-moderated, your post may not appear immediately.)
* Improving search by compound indexes. * Fixing review findings. * Fixing the index condition handling. * Adjusting test cases. * Fixing TableFilter.prepare(). * Adjusting test cases. * Refactoring the constructors of the IndexCondition class. * Removing unnecessary TODOs. * Fixing review findings. * Introducing the TestCompoundIndexSearch class. * Preparing IndexCondition to deal with queries where the columns not used in the right order. * Fixing test errors.
Hi,
As I can see, searching by compound indexes does not work properly.
I created table with some data inside to test the query execution. See: https://github.com/kiss034/h2database/blob/test/sql/company.sql
I tested with the following SQL:
Before my changes, the engine was able to find the matching compound index while preparing, but later it was deleted from the index conditions, so a full table scan was executed.
I prepared the IndexCondition and IndexCursor classes to deal with multiple columns. After these changes the indexed search was executed properly.
As I can see, the previous implementation used only the first indexed component of the compound IN condition. I do not know whether it was intentional. My implementation does not support single indexes in such queries.
Furthermore, currently the following test are fail.
Could you please let me know if my implementation or the test should be fixed? Thanks.
Best regards
János Áron Kiss