Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: Prune Update and Upsert input columns #34522

Merged
merged 4 commits into from
Feb 5, 2019

Conversation

andy-kimball
Copy link
Contributor

Prune input columns that are not needed by Update and Upsert operators.
Needed columns include returned columns and columns needed to formulate
KV Put and Delete operations to implement the mutations. All other
columns can be pruned.

This commit contains new code to enable the Upsert fast path with the
CBO. The fast path uses the KV Put method to blindly insert new rows
or overwrite existing rows. Because any existing data is ignored, it is
not necessary to fetch existing rows, as is normally required. This
also allows a much simpler logical plan.

After this change, the optimizer updates/upserts no longer need to stay
behind a feature flag, as all fast paths should now work at least as
well as they do with the heuristic planner.

Release note (sql change): Delete, Update, and Upsert statements are
now planned by the cost-based optimizer.

@andy-kimball andy-kimball requested a review from a team as a code owner February 3, 2019 08:22
@andy-kimball andy-kimball requested review from a team February 3, 2019 08:22
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 16 of 16 files at r1, 16 of 16 files at r2, 28 of 28 files at r3, 12 of 12 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, and @RaduBerinde)


pkg/sql/opt_catalog.go, line 364 at r1 (raw file):

	stats []optTableStat

	// family is the inlined wrapper for the table's primary family, where

this comment got cut off.


pkg/sql/opt_catalog.go, line 367 at r1 (raw file):

	primaryFamily optFamily

	// families is a cache of family wrappers (all except default family) that's

is default family the same as primaryFamily?


pkg/sql/opt_exec_factory.go, line 1077 at r3 (raw file):

	var fkCheckType row.FKCheckType
	if len(updateColDescs) == 0 {
		fkCheckType = row.CheckInserts

Is there no longer a possibility that we only need to check inserts?


pkg/sql/opt/cat/family.go, line 1 at r1 (raw file):

// Copyright 2018 The Cockroach Authors.

[nit] 2019


pkg/sql/opt/cat/table.go, line 128 at r1 (raw file):

	// FamilyCount returns the number of column families present on the table.
	// There is always at least one primary family (always family 0) where columns
	// go if they are not explicitly assigned to another family.

What if all columns are explicitly assigned? Is there still a primary family?


pkg/sql/opt/optbuilder/mutation_builder.go, line 413 at r2 (raw file):

	if needResults {
		// Only non-mutation columns are output columns.
		private.ReturnCols = make(opt.ColList, mb.tab.DeletableColumnCount())

shouldn't this be ColumnCount()?


pkg/sql/opt/testutils/testcat/create_table.go, line 72 at r1 (raw file):

	}

	// Add non-mutation columns and families.

I don't see where the families are getting added here.

Copy link
Contributor Author

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)


pkg/sql/opt_catalog.go, line 364 at r1 (raw file):

Previously, rytaft wrote…

this comment got cut off.

Done.


pkg/sql/opt_catalog.go, line 367 at r1 (raw file):

Previously, rytaft wrote…

is default family the same as primaryFamily?

Done.


pkg/sql/opt_exec_factory.go, line 1077 at r3 (raw file):

Previously, rytaft wrote…

Is there no longer a possibility that we only need to check inserts?

No, there shouldn't be any such possibility. We map that case into an Insert.


pkg/sql/opt/cat/family.go, line 1 at r1 (raw file):

Previously, rytaft wrote…

[nit] 2019

Done.


pkg/sql/opt/cat/table.go, line 128 at r1 (raw file):

Previously, rytaft wrote…

What if all columns are explicitly assigned? Is there still a primary family?

The primary family is always the first family that was explicitly defined, or if no families were defined, then a primary family is synthesized. I added to the comment.


pkg/sql/opt/optbuilder/mutation_builder.go, line 413 at r2 (raw file):

Previously, rytaft wrote…

shouldn't this be ColumnCount()?

No, the length of these column lists is always DeletableColumnCount, but missing columns are set to 0. So in this case the length is DeletableColumnCount, but we only fill in the first ColumnCount entries. I added to the comment.


pkg/sql/opt/testutils/testcat/create_table.go, line 72 at r1 (raw file):

Previously, rytaft wrote…

I don't see where the families are getting added here.

Done.

Add Family collection to cat.Table. This information will be used by the
Update column pruning code.

Release note: None
Previously, the list of RETURNING columns was inferred for mutation
operators. However, in order to make pruning of input columns easier, and
to prepare for a future when ReturnCols can be pruned, this commit makes
the list explicit.

For now, the new ReturnCols list is never pruned. Additional execution
support is required for that.

Release note: None
Prune input columns that are not needed by Update and Upsert operators.
Needed columns include returned columns and columns needed to formulate
KV Put and Delete operations to implement the mutations. All other
columns can be pruned.

This commit contains new code to enable the Upsert fast path with the
CBO. The fast path uses the KV Put method to blindly insert new rows
or overwrite existing rows. Because any existing data is ignored, it is
not necessary to fetch existing rows, as is normally required. This
also allows a much simpler logical plan.

After this change, the optimizer updates/upserts no longer need to stay
behind a feature flag, as all fast paths should now work at least as
well as they do with the heuristic planner.

Release note (sql change): Delete, Update, and Upsert statements are
now planned by the cost-based optimizer.
@andy-kimball
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 5, 2019

Merge conflict (retrying...)

craig bot pushed a commit that referenced this pull request Feb 5, 2019
34522: opt: Prune Update and Upsert input columns r=andy-kimball a=andy-kimball

Prune input columns that are not needed by Update and Upsert operators.
Needed columns include returned columns and columns needed to formulate
KV Put and Delete operations to implement the mutations. All other
columns can be pruned.

This commit contains new code to enable the Upsert fast path with the
CBO. The fast path uses the KV Put method to blindly insert new rows
or overwrite existing rows. Because any existing data is ignored, it is
not necessary to fetch existing rows, as is normally required. This
also allows a much simpler logical plan.

After this change, the optimizer updates/upserts no longer need to stay
behind a feature flag, as all fast paths should now work at least as
well as they do with the heuristic planner.

Release note (sql change): Delete, Update, and Upsert statements are
now planned by the cost-based optimizer.

34529: sql: enable automatic statistics by default r=rytaft a=rytaft

This commit changes the default value of the cluster setting
`sql.stats.experimental_automatic_collection.enabled` to true.
As a result, automatic statistics collection is now enabled by
default. It can still be disabled by setting
`sql.stats.experimental_automatic_collection.enabled=false`.

Release note (sql change): Enabled automatic statistics collection.

Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Feb 5, 2019

Build succeeded

@craig craig bot merged commit cabe058 into cockroachdb:master Feb 5, 2019
@andy-kimball andy-kimball deleted the update branch February 5, 2019 02:06
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 7, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page to add:

  - Add FILTER per cockroachdb/cockroach#34077

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 7, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add FILTER per cockroachdb/cockroach#34077

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Remove `experimental_optimizer_updates` cluster setting (can't find
    a commit for this, but I don't see it in `SHOW ALL` output on my
    local build of yesterday's `master`, version number is
    `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`.

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 8, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting (can't find
    a commit for this, but I don't see it in `SHOW ALL` output on my
    local build of yesterday's `master`, version number is
    `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`.

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 11, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add `SELECT`, `VALUES`, and `UNION` statements that do not include
    window functions

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland added a commit to cockroachdb/docs that referenced this pull request Feb 11, 2019
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add DELETE per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add `SELECT`, `VALUES`, and `UNION` statements that do not include
    window functions

  - Add FILTER clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants