Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: produce constant columns in predicate from partial index scan #62406

Merged
merged 2 commits into from
Mar 26, 2021

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Mar 23, 2021

opt: refactor indexScanBuilder

This commit refactors indexScanBuilder to make it safer to use and
maintain.

  • Functions that are intended for use outside of the struct's
    implementation have been capitalized to make this more apparent.
  • Helper functions that determine the state of the builder have been
    added to simplify the internal logic of the builder.
  • primaryKeyCols was removed because it is unrelated to building
    scans.

Release note: None

opt: produce constant columns in predicate from partial index scan

GenerateConstrainedScans and GeneratePartialIndexScans now build
Project expressions to produce columns that are held constant in partial
index predicates when possible. This allows a partial index to cover
columns and avoid IndexJoins in more cases.

A column held constant in a partial index predicate is only projected
when:

  1. The index does not include the column.
  2. All columns can be covered because they are either in the index or
    held constant by the predicate. In other words, a Project is only
    built when doing so eliminates an IndexJoin.

Resolves #60532

Release note (performance improvement): Columns that are held constant
in partial index predicates can now be produced when scanning the
partial index. This eliminates unnecessary primary index joins to
retrieve those constant columns in some queries, resulting in lower
latency.

@mgartner mgartner requested a review from a team as a code owner March 23, 2021 00:10
@cockroach-teamcity
Copy link
Member

This change is Reviewable

if val == tree.DNull {
// NULL values should always be a memo.NullExpr, not a
// memo.ConstExpr.
scalar = memo.NullSingleton
Copy link
Collaborator Author

@mgartner mgartner Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure if it's safe to project an untyped NULL here, but I did some simply ad-hoc testing and it seemed fine.

@@ -17,7 +17,7 @@ CREATE TABLE a (
# Eliminate the IndexJoin when the Project passthrough columns are a subset of
# the IndexJoin's input columns.
opt expect=EliminateIndexJoinInsideProject
SELECT k, i FROM a WHERE s = 'foo'
SELECT k, i FROM a WHERE s LIKE 'foo%'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change the predicate so that the index join wasn't replace with a projection of the constant foo.

@mgartner mgartner force-pushed the project-const-partial-index-pred branch from 13788b1 to 7a503d2 Compare March 23, 2021 01:17
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)


pkg/sql/opt/xform/scan_index_iter.go, line 358 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I'm not 100% sure if it's safe to project an untyped NULL here, but I did some simply ad-hoc testing and it seemed fine.

This is what the optbuilder does when it builds IS NOT NULL, it should be fine.


pkg/sql/opt/xform/scan_index_iter.go, line 286 at r3 (raw file):

		var constProj memo.ProjectionsExpr
		if !isCovering && len(predFilters) > 0 {
			constCols := memo.ExtractConstColumns(predFilters, it.evalCtx)

We have to be careful to not replace composite types. For example, the table might contain 1.00 and the predicate is x = 1.0; we want to return 1.00 not 1.0. This is a lame example, but it would be more egregious with collated strings.

@mgartner mgartner force-pushed the project-const-partial-index-pred branch from 7a503d2 to 15f3812 Compare March 24, 2021 23:21
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rytaft)


pkg/sql/opt/xform/scan_index_iter.go, line 286 at r3 (raw file):

Previously, RaduBerinde wrote…

We have to be careful to not replace composite types. For example, the table might contain 1.00 and the predicate is x = 1.0; we want to return 1.00 not 1.0. This is a lame example, but it would be more egregious with collated strings.

Great catch! Fixed.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

[nit] in the commit/PR message: "when doing eliminates" -> "when doing so eliminates"

Reviewed 4 of 4 files at r1, 7 of 9 files at r2, 1 of 1 files at r3, 2 of 2 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/opt/xform/index_scan_builder.go, line 52 at r1 (raw file):

}

func (b *indexScanBuilder) init(c *CustomFuncs, tabID opt.TableID) {

should this be capitalized?


pkg/sql/opt/xform/index_scan_builder.go, line 73 at r1 (raw file):

}

// hasInnerFilters returns true if inverted filters have been added to the

hasInnerFilters -> hasInvertedFilter


pkg/sql/opt/xform/index_scan_builder.go, line 79 at r1 (raw file):

}

// hasInnerFilters returns true if an index join has been added to the builder.

hasInnerFilters -> hasIndexJoin


pkg/sql/opt/xform/index_scan_builder.go, line 106 at r1 (raw file):

	if spanExpr != nil {
		if b.hasInvertedFilter() {
			panic(errors.AssertionFailedf("cannot call addInvertedFilter twice"))

addInvertedFilter -> AddInvertedFilter


pkg/sql/opt/xform/index_scan_builder.go, line 125 at r1 (raw file):

		if !b.hasIndexJoin() {
			if b.hasInnerFilters() {
				panic(errors.AssertionFailedf("cannot call addSelect methods twice before index join is added"))

addSelect -> AddSelect


pkg/sql/opt/xform/index_scan_builder.go, line 130 at r1 (raw file):

		} else {
			if b.hasOuterFilters() {
				panic(errors.AssertionFailedf("cannot call addSelect methods twice after index join is added"))

ditto


pkg/sql/opt/xform/index_scan_builder.go, line 109 at r4 (raw file):

	if len(proj) != 0 {
		if b.hasConstProjections() {
			panic(errors.AssertionFailedf("cannot call addConstProjections twice"))

addConstProjections -> AddConstProjections


pkg/sql/opt/xform/index_scan_builder.go, line 112 at r4 (raw file):

		}
		if b.hasInnerFilters() || b.hasOuterFilters() {
			panic(errors.AssertionFailedf("cannot call addConstProjections after filters are added"))

ditto


pkg/sql/opt/xform/index_scan_builder.go, line 229 at r4 (raw file):

	}

	// 3. Wrap scan in inner filter if it was added.

[nit] scan -> input


pkg/sql/opt/xform/testdata/rules/select, line 5351 at r4 (raw file):

      │    └── constraint: /3/1: [/20 - /20]
      └── projections
           └── 'foo'

can we avoid adding the projection in this case?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/opt/xform/index_scan_builder.go, line 217 at r4 (raw file):

	// 2. Wrap input in a Project if constant projections were added.
	if b.hasConstProjections() {
		if !b.hasInnerFilters() && !b.hasInvertedFilter() && !b.hasIndexJoin() {

Are we missing a !hasOuterFilters? (or we cannot have outer filters if we don't have inner filters?)

More of a meta-comment but each step verifying the conditions for the next steps is error-prone. Maybe it would be better to first look at all conditions and determine which is the last step (literally we could have a lastStep int).

This commit refactors `indexScanBuilder` to make it safer to use and
maintain.

  - Functions that are intended for use outside of the struct's
    implementation have been capitalized to make this more apparent.
  - Helper functions that determine the state of the builder have been
    added to simplify the internal logic of the builder.
  - `primaryKeyCols` was removed because it is unrelated to building
    scans.

Release note: None
@mgartner mgartner force-pushed the project-const-partial-index-pred branch from 15f3812 to 3951dcb Compare March 26, 2021 00:52
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] in the commit/PR message: "when doing eliminates" -> "when doing so eliminates"

Done.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @rytaft)


pkg/sql/opt/xform/index_scan_builder.go, line 52 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

should this be capitalized?

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 73 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

hasInnerFilters -> hasInvertedFilter

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 79 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

hasInnerFilters -> hasIndexJoin

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 106 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

addInvertedFilter -> AddInvertedFilter

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 125 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

addSelect -> AddSelect

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 130 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 109 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

addConstProjections -> AddConstProjections

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 112 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto

Done.


pkg/sql/opt/xform/index_scan_builder.go, line 217 at r4 (raw file):

Previously, RaduBerinde wrote…

Are we missing a !hasOuterFilters? (or we cannot have outer filters if we don't have inner filters?)

More of a meta-comment but each step verifying the conditions for the next steps is error-prone. Maybe it would be better to first look at all conditions and determine which is the last step (literally we could have a lastStep int).

We cannot have outer filter if we don't have an index join. Outer filters are filters that can't be applied until after we've retrieved the missing columns with an index join.

I agree this is error-prone and I don't love it. But I'm not convinced the lastStep suggestion would solve this though - if n is the last step, not all steps 1..n will necessarily happen. For example, it's possible to have just an index join and outer filters, which would execute steps 1, 5, and 6.


pkg/sql/opt/xform/index_scan_builder.go, line 229 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] scan -> input

Done.


pkg/sql/opt/xform/testdata/rules/select, line 5351 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

can we avoid adding the projection in this case?

This is another one of those cases where the plan isn't ideal because normalization rules aren't applied above the exploration. Previously, the EliminateIndexJoinInsideProject exploration rule removed an unnecessary index join:

================================================================================
GenerateConstrainedScans
  Cost: 10.52
================================================================================
   project
    ├── columns: k:1!null
    ├── key: (1)
    └── index-join zz_partial
         ├── columns: k:1!null j:3!null s:6!null
         ├── key: (1)
         ├── fd: ()-->(3,6)
  -      └── select
  +      └── scan zz_partial@j,partial
              ├── columns: k:1!null j:3!null
  +           ├── constraint: /3/1: [/20 - /20]
              ├── key: (1)
  -           ├── fd: ()-->(3)
  -           ├── scan zz_partial@j,partial
  -           │    ├── columns: k:1!null j:3
  -           │    ├── key: (1)
  -           │    └── fd: (1)-->(3)
  -           └── filters
  -                └── j:3 = 20 [outer=(3), constraints=(/3: [/20 - /20]; tight), fd=()-->(3)]
  +           └── fd: ()-->(3)
================================================================================
EliminateIndexJoinInsideProject
  Cost: 4.98
================================================================================
   project
    ├── columns: k:1!null
    ├── key: (1)
  - └── index-join zz_partial
  -      ├── columns: k:1!null j:3!null s:6!null
  + └── scan zz_partial@j,partial
  +      ├── columns: k:1!null j:3!null
  +      ├── constraint: /3/1: [/20 - /20]
         ├── key: (1)
  -      ├── fd: ()-->(3,6)
  -      └── scan zz_partial@j,partial
  -           ├── columns: k:1!null j:3!null
  -           ├── constraint: /3/1: [/20 - /20]
  -           ├── key: (1)
  -           └── fd: ()-->(3)
  +      └── fd: ()-->(3)
================================================================================
Final best expression
  Cost: 4.98
================================================================================
  project
   ├── columns: k:1!null
   ├── key: (1)
   └── scan zz_partial@j,partial
        ├── columns: k:1!null j:3!null
        ├── constraint: /3/1: [/20 - /20]
        ├── key: (1)
        └── fd: ()-->(3)

Any expressions in the matched Select's group must output the columns referenced in the Select's filter, so this extra exploration rule was required. We'd have to add a similar exploration rule which eliminates these types of unnecessary Projects. It seems less urgent to me than the index join case, since an extra Project is much less of a price to pay than an extra index join. But if you think such an exploration rule should be added, I'm happy to do so in a follow-up PR.

I've added a TODO to this test to note that it's not optimal.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r5, 10 of 10 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)


pkg/sql/opt/xform/index_scan_builder.go, line 229 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Done.

looks like you didn't change this (not that it's very important, but in case you meant to make the change...)


pkg/sql/opt/xform/index_scan_builder.go, line 52 at r5 (raw file):

}

// Init initialized an indexScanBuilder.

nit: initialized -> initializes


pkg/sql/opt/xform/testdata/rules/select, line 5351 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This is another one of those cases where the plan isn't ideal because normalization rules aren't applied above the exploration. Previously, the EliminateIndexJoinInsideProject exploration rule removed an unnecessary index join:

================================================================================
GenerateConstrainedScans
  Cost: 10.52
================================================================================
   project
    ├── columns: k:1!null
    ├── key: (1)
    └── index-join zz_partial
         ├── columns: k:1!null j:3!null s:6!null
         ├── key: (1)
         ├── fd: ()-->(3,6)
  -      └── select
  +      └── scan zz_partial@j,partial
              ├── columns: k:1!null j:3!null
  +           ├── constraint: /3/1: [/20 - /20]
              ├── key: (1)
  -           ├── fd: ()-->(3)
  -           ├── scan zz_partial@j,partial
  -           │    ├── columns: k:1!null j:3
  -           │    ├── key: (1)
  -           │    └── fd: (1)-->(3)
  -           └── filters
  -                └── j:3 = 20 [outer=(3), constraints=(/3: [/20 - /20]; tight), fd=()-->(3)]
  +           └── fd: ()-->(3)
================================================================================
EliminateIndexJoinInsideProject
  Cost: 4.98
================================================================================
   project
    ├── columns: k:1!null
    ├── key: (1)
  - └── index-join zz_partial
  -      ├── columns: k:1!null j:3!null s:6!null
  + └── scan zz_partial@j,partial
  +      ├── columns: k:1!null j:3!null
  +      ├── constraint: /3/1: [/20 - /20]
         ├── key: (1)
  -      ├── fd: ()-->(3,6)
  -      └── scan zz_partial@j,partial
  -           ├── columns: k:1!null j:3!null
  -           ├── constraint: /3/1: [/20 - /20]
  -           ├── key: (1)
  -           └── fd: ()-->(3)
  +      └── fd: ()-->(3)
================================================================================
Final best expression
  Cost: 4.98
================================================================================
  project
   ├── columns: k:1!null
   ├── key: (1)
   └── scan zz_partial@j,partial
        ├── columns: k:1!null j:3!null
        ├── constraint: /3/1: [/20 - /20]
        ├── key: (1)
        └── fd: ()-->(3)

Any expressions in the matched Select's group must output the columns referenced in the Select's filter, so this extra exploration rule was required. We'd have to add a similar exploration rule which eliminates these types of unnecessary Projects. It seems less urgent to me than the index join case, since an extra Project is much less of a price to pay than an extra index join. But if you think such an exploration rule should be added, I'm happy to do so in a follow-up PR.

I've added a TODO to this test to note that it's not optimal.

Agree it's not urgent -- a TODO seems fine for now

`GenerateConstrainedScans` and `GeneratePartialIndexScans` now build
Project expressions to produce columns that are held constant in partial
index predicates when possible. This allows a partial index to cover
columns and avoid IndexJoins in more cases.

A column held constant in a partial index predicate is only projected
when:

  1. The index does not include the column.
  2. All columns can be covered because they are either in the index or
     held constant by the predicate. In other words, a Project is only
     built when so doing eliminates an IndexJoin.

Resolves cockroachdb#60532

Release note (performance improvement): Columns that are held constant
in partial index predicates can now be produced when scanning the
partial index. This eliminates unnecessary primary index joins to
retrieve those constant columns in some queries, resulting in lower
latency.
@mgartner mgartner force-pushed the project-const-partial-index-pred branch from 3951dcb to 3bb6aa1 Compare March 26, 2021 18:40
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @RaduBerinde and @rytaft)


pkg/sql/opt/xform/index_scan_builder.go, line 229 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

looks like you didn't change this (not that it's very important, but in case you meant to make the change...)

good catch, must have gotta confused during some rebasing... fixed


pkg/sql/opt/xform/index_scan_builder.go, line 52 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: initialized -> initializes

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @RaduBerinde)

@mgartner
Copy link
Collaborator Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 26, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: produce constant predicate columns from partial index scan
4 participants