-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: Prune Update and Upsert input columns #34522
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 16 of 16 files at r1, 16 of 16 files at r2, 28 of 28 files at r3, 12 of 12 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, and @RaduBerinde)
pkg/sql/opt_catalog.go, line 364 at r1 (raw file):
stats []optTableStat // family is the inlined wrapper for the table's primary family, where
this comment got cut off.
pkg/sql/opt_catalog.go, line 367 at r1 (raw file):
primaryFamily optFamily // families is a cache of family wrappers (all except default family) that's
is default family the same as primaryFamily?
pkg/sql/opt_exec_factory.go, line 1077 at r3 (raw file):
var fkCheckType row.FKCheckType if len(updateColDescs) == 0 { fkCheckType = row.CheckInserts
Is there no longer a possibility that we only need to check inserts?
pkg/sql/opt/cat/family.go, line 1 at r1 (raw file):
// Copyright 2018 The Cockroach Authors.
[nit] 2019
pkg/sql/opt/cat/table.go, line 128 at r1 (raw file):
// FamilyCount returns the number of column families present on the table. // There is always at least one primary family (always family 0) where columns // go if they are not explicitly assigned to another family.
What if all columns are explicitly assigned? Is there still a primary family?
pkg/sql/opt/optbuilder/mutation_builder.go, line 413 at r2 (raw file):
if needResults { // Only non-mutation columns are output columns. private.ReturnCols = make(opt.ColList, mb.tab.DeletableColumnCount())
shouldn't this be ColumnCount()
?
pkg/sql/opt/testutils/testcat/create_table.go, line 72 at r1 (raw file):
} // Add non-mutation columns and families.
I don't see where the families are getting added here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)
pkg/sql/opt_catalog.go, line 364 at r1 (raw file):
Previously, rytaft wrote…
this comment got cut off.
Done.
pkg/sql/opt_catalog.go, line 367 at r1 (raw file):
Previously, rytaft wrote…
is default family the same as primaryFamily?
Done.
pkg/sql/opt_exec_factory.go, line 1077 at r3 (raw file):
Previously, rytaft wrote…
Is there no longer a possibility that we only need to check inserts?
No, there shouldn't be any such possibility. We map that case into an Insert
.
pkg/sql/opt/cat/family.go, line 1 at r1 (raw file):
Previously, rytaft wrote…
[nit] 2019
Done.
pkg/sql/opt/cat/table.go, line 128 at r1 (raw file):
Previously, rytaft wrote…
What if all columns are explicitly assigned? Is there still a primary family?
The primary family is always the first family that was explicitly defined, or if no families were defined, then a primary family is synthesized. I added to the comment.
pkg/sql/opt/optbuilder/mutation_builder.go, line 413 at r2 (raw file):
Previously, rytaft wrote…
shouldn't this be
ColumnCount()
?
No, the length of these column lists is always DeletableColumnCount
, but missing columns are set to 0. So in this case the length is DeletableColumnCount
, but we only fill in the first ColumnCount
entries. I added to the comment.
pkg/sql/opt/testutils/testcat/create_table.go, line 72 at r1 (raw file):
Previously, rytaft wrote…
I don't see where the families are getting added here.
Done.
Add Family collection to cat.Table. This information will be used by the Update column pruning code. Release note: None
Previously, the list of RETURNING columns was inferred for mutation operators. However, in order to make pruning of input columns easier, and to prepare for a future when ReturnCols can be pruned, this commit makes the list explicit. For now, the new ReturnCols list is never pruned. Additional execution support is required for that. Release note: None
Prune input columns that are not needed by Update and Upsert operators. Needed columns include returned columns and columns needed to formulate KV Put and Delete operations to implement the mutations. All other columns can be pruned. This commit contains new code to enable the Upsert fast path with the CBO. The fast path uses the KV Put method to blindly insert new rows or overwrite existing rows. Because any existing data is ignored, it is not necessary to fetch existing rows, as is normally required. This also allows a much simpler logical plan. After this change, the optimizer updates/upserts no longer need to stay behind a feature flag, as all fast paths should now work at least as well as they do with the heuristic planner. Release note (sql change): Delete, Update, and Upsert statements are now planned by the cost-based optimizer.
Release note: None
bors r+ |
Merge conflict (retrying...) |
34522: opt: Prune Update and Upsert input columns r=andy-kimball a=andy-kimball Prune input columns that are not needed by Update and Upsert operators. Needed columns include returned columns and columns needed to formulate KV Put and Delete operations to implement the mutations. All other columns can be pruned. This commit contains new code to enable the Upsert fast path with the CBO. The fast path uses the KV Put method to blindly insert new rows or overwrite existing rows. Because any existing data is ignored, it is not necessary to fetch existing rows, as is normally required. This also allows a much simpler logical plan. After this change, the optimizer updates/upserts no longer need to stay behind a feature flag, as all fast paths should now work at least as well as they do with the heuristic planner. Release note (sql change): Delete, Update, and Upsert statements are now planned by the cost-based optimizer. 34529: sql: enable automatic statistics by default r=rytaft a=rytaft This commit changes the default value of the cluster setting `sql.stats.experimental_automatic_collection.enabled` to true. As a result, automatic statistics collection is now enabled by default. It can still be disabled by setting `sql.stats.experimental_automatic_collection.enabled=false`. Release note (sql change): Enabled automatic statistics collection. Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com> Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Build succeeded |
A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page to add: - Add FILTER per cockroachdb/cockroach#34077 - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add FILTER per cockroachdb/cockroach#34077 - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Remove `experimental_optimizer_updates` cluster setting (can't find a commit for this, but I don't see it in `SHOW ALL` output on my local build of yesterday's `master`, version number is `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`. [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Add FILTER clause on aggregate functions per cockroachdb/cockroach#34077 - Remove `experimental_optimizer_updates` cluster setting (can't find a commit for this, but I don't see it in `SHOW ALL` output on my local build of yesterday's `master`, version number is `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`. [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Add `SELECT`, `VALUES`, and `UNION` statements that do not include window functions - Add FILTER clause on aggregate functions per cockroachdb/cockroach#34077 - Remove `experimental_optimizer_updates` cluster setting [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Add `SELECT`, `VALUES`, and `UNION` statements that do not include window functions - Add FILTER clause on aggregate functions per cockroachdb/cockroach#34077 - Remove `experimental_optimizer_updates` cluster setting [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
Prune input columns that are not needed by Update and Upsert operators.
Needed columns include returned columns and columns needed to formulate
KV Put and Delete operations to implement the mutations. All other
columns can be pruned.
This commit contains new code to enable the Upsert fast path with the
CBO. The fast path uses the KV Put method to blindly insert new rows
or overwrite existing rows. Because any existing data is ignored, it is
not necessary to fetch existing rows, as is normally required. This
also allows a much simpler logical plan.
After this change, the optimizer updates/upserts no longer need to stay
behind a feature flag, as all fast paths should now work at least as
well as they do with the heuristic planner.
Release note (sql change): Delete, Update, and Upsert statements are
now planned by the cost-based optimizer.