Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: Prune Delete operator input columns #34303

Merged
merged 5 commits into from Jan 31, 2019
Merged

Conversation

andy-kimball
Copy link
Contributor

Prune input columns that are not needed by a Delete operator. Needed
columns include returned columns and index key columns. All other columns
can be pruned.

Pruning Delete columns causes new interactions with the Delete "fast path"
that uses range delete. When multiple column families are in use, the
range delete needs to cover all column families. The "forDelete" flag
is used to construct spans that cover all columns, rather than just
"needed" columns. A new ConstructDeleteRange exec factory function sets
this flag correctly.

After this change, optimizer deletes no longer need to stay behind a
feature flag, as all fast paths should now work at least as well as they
do with the heuristic planner.

Release note: None

@andy-kimball andy-kimball requested a review from a team as a code owner January 28, 2019 19:36
@andy-kimball andy-kimball requested review from a team January 28, 2019 19:36
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andy-kimball
Copy link
Contributor Author

@jordanlewis, I've added a ConstructDeleteRange method that I think should hook into your new operator easily.

@jordanlewis
Copy link
Member

Nice. Should this come sequenced after the execution PR?

Copy link
Contributor Author

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume so, if you get yours in soon.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)

@jordanlewis
Copy link
Member

Mine is merged now.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: (modulo using the new delete range operator)

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, @RaduBerinde, and @rytaft)


pkg/sql/opt/metadata.go, line 220 at r3 (raw file):

// is passed separately so that its original formatting is preserved for error
// messages, pretty-printing, etc.
func (md *Metadata) AddTableWithAlias(tab cat.Table, alias tree.TableName) TableID {

[nit] most functions pass *TableName. We'd just make a copy so it shouldn't escape


pkg/sql/opt/cat/table.go, line 29 at r1 (raw file):

// Table is an interface to a database table, exposing only the information
// needed by the query optimizer.
type Table interface {

[nit] This comment could be extended with some general info on public, writable, deletable columns.


pkg/sql/opt/cat/table.go, line 43 at r1 (raw file):

	ColumnCount() int

	// WritableColumnCount returns the number of public and writable columns in

In these descriptions, "writable" and "deletable" each have two meanings (they either include the other columns or not). I would switch to using "write-only" and "delete-only" when referring to just those columns (consistent with the ImmutableTableDescriptor terminology). E.g. WritableColumnCount returns the number of public and write-only columns.


pkg/sql/opt/cat/table.go, line 73 at r2 (raw file):

	IndexCount() int

	// WritableIndexCount returns the number of public and writable indexes

Same suggestion, "write-only indexes".

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 25 of 25 files at r1, 5 of 5 files at r2, 15 of 15 files at r3, 20 of 20 files at r4.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @andy-kimball and @justinj)


pkg/sql/logictest/testdata/logic_test/delete, line 267 at r4 (raw file):

BEGIN; ALTER TABLE family ADD COLUMN z INT CREATE FAMILY

# Check that the new column is not usable in RETURNING

I think this comment is stale


pkg/sql/opt/cat/column.go, line 75 at r1 (raw file):

// the given ordinal position is a mutation column.
func IsMutationColumn(table Table, ord int) bool {
	return ord >= table.ColumnCount()

Do you also want to check that it's less than table.DeletableColumnCount()?


pkg/sql/opt/norm/testdata/rules/prune_cols, line 1653 at r4 (raw file):


# Prune when computed ordering column is present.
opt

do you want to add an expectation here?


pkg/sql/opt/norm/testdata/rules/prune_cols, line 1706 at r4 (raw file):


# Prune when mutation columns/indexes exist.
opt

and here?


pkg/sql/opt/optbuilder/insert.go, line 408 at r1 (raw file):

// addTargetTableColsForInsert adds up to maxCols columns to the list of columns
// that will be set by an INSERT operation. Non-mutation olumns are added from

Non-mutation olumns -> Non-mutation columns


pkg/sql/opt/optbuilder/insert.go, line 423 at r1 (raw file):

	// Only consider non-mutation columns, since mutation columns are hidden from
	// the SQL user.

Is this a change from before?

knz and others added 4 commits January 31, 2019 06:05
log.Fatal has an interesting logic to ensure it never returns,
together with a timeout in case something bad happens. This logic is
non-trivial and wasn't documented. This patch adds some doc.

Release note: None
Previously, the cat.Table.ColumnCount method returned a count that
included mutation columns. After this commit, it will no longer include
mutation columns. In addition, it adds new WritableColumnCount and
DeletableColumnCount methods to cat.Table. This makes the cat.Table
interface more like the underlying Table struct.

Release note: None
Add cat.Table.WritableIndexCount and cat.Table.DeletableIndexCount
methods to the catalog. These allow access to indexes that are in
process of being added or dropped to/from the table. This is needed
for mutation operator optimization.

Release note: None
Currently, TableMeta stores any table alias as a string. This commit changes
that to instead store the resolved TableName or query alias as a TableName.
While the current situation is OK, this change makes the alias more
convenient to use, and fits better with other parts of the code.

Release note: None
Copy link
Contributor Author

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @justinj and @rytaft)


pkg/sql/logictest/testdata/logic_test/delete, line 267 at r4 (raw file):

Previously, rytaft wrote…

I think this comment is stale

Done.


pkg/sql/opt/cat/column.go, line 75 at r1 (raw file):

Previously, rytaft wrote…

Do you also want to check that it's less than table.DeletableColumnCount()?

That'd be more of an assert, since only mutation columns are >= ColumnCount. I want this to be inlined, and don't want to add an assert panic in this code path, since it's used in contexts where we should be returning errors rather than panicking. I'm also not too worried about that case. So I'll leave as-is.


pkg/sql/opt/norm/testdata/rules/prune_cols, line 1653 at r4 (raw file):

Previously, rytaft wrote…

do you want to add an expectation here?

Done.


pkg/sql/opt/norm/testdata/rules/prune_cols, line 1706 at r4 (raw file):

Previously, rytaft wrote…

and here?

Done.


pkg/sql/opt/optbuilder/insert.go, line 408 at r1 (raw file):

Previously, rytaft wrote…

Non-mutation olumns -> Non-mutation columns

Done.


pkg/sql/opt/optbuilder/insert.go, line 423 at r1 (raw file):

Previously, rytaft wrote…

Is this a change from before?

No, just additional comment for clarification.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 47 files at r5, 13 of 25 files at r6, 1 of 5 files at r7, 9 of 15 files at r8, 23 of 23 files at r9.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @justinj)

Prune input columns that are not needed by a Delete operator. Needed
columns include returned columns and index key columns. All other columns
can be pruned.

Pruning Delete columns causes new interactions with the Delete "fast path"
that uses range delete. When multiple column families are in use, the
range delete needs to cover all column families. The "forDelete" flag
is used to construct spans that cover all columns, rather than just
"needed" columns. A new ConstructDeleteRange exec factory function sets
this flag correctly.

After this change, optimizer deletes no longer need to stay behind a
feature flag, as all fast paths should now work at least as well as they
do with the heuristic planner.

Release note (sql change): Delete statements are now planned by the cost-
based optimizer. The ORDER BY clause can no longer be used with a DELETE
statement when there is no LIMIT clause present. Sorting the output should
instead be done using SELECT..FROM [DELETE ..] ORDER BY ..
Copy link
Contributor

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changefeed changes lgtm

@andy-kimball
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Jan 31, 2019
34303: opt: Prune Delete operator input columns r=andy-kimball a=andy-kimball

Prune input columns that are not needed by a Delete operator. Needed
columns include returned columns and index key columns. All other columns
can be pruned.

Pruning Delete columns causes new interactions with the Delete "fast path"
that uses range delete. When multiple column families are in use, the
range delete needs to cover all column families. The "forDelete" flag
is used to construct spans that cover all columns, rather than just
"needed" columns. A new ConstructDeleteRange exec factory function sets
this flag correctly.

After this change, optimizer deletes no longer need to stay behind a
feature flag, as all fast paths should now work at least as well as they
do with the heuristic planner.

Release note: None


Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Jan 31, 2019

Build failed

@andy-kimball
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Jan 31, 2019
34303: opt: Prune Delete operator input columns r=andy-kimball a=andy-kimball

Prune input columns that are not needed by a Delete operator. Needed
columns include returned columns and index key columns. All other columns
can be pruned.

Pruning Delete columns causes new interactions with the Delete "fast path"
that uses range delete. When multiple column families are in use, the
range delete needs to cover all column families. The "forDelete" flag
is used to construct spans that cover all columns, rather than just
"needed" columns. A new ConstructDeleteRange exec factory function sets
this flag correctly.

After this change, optimizer deletes no longer need to stay behind a
feature flag, as all fast paths should now work at least as well as they
do with the heuristic planner.

Release note: None


Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Jan 31, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants