sql: implement WITH RECURSIVE with UNION #71685

Merged
merged 2 commits into cockroachdb:master from recursive-cte-union on Oct 21, 2021

Conversation

RaduBerinde
Member

@RaduBerinde RaduBerinde commented Oct 19, 2021

sql: implement WITH RECURSIVE with UNION

This change implements the UNION variant of WITH RECURSIVE, where rows
are deduplicated. We achieve this by storing all rows in a
deduplicating container and inserting each row into that container
first, which detects whether the row is a duplicate.

Fixes #46642.

Release note (sql change): The WITH RECURSIVE variant that uses UNION
(as opposed to UNION ALL) is now supported.

sql: capitalize rowContainerHelper API

This change capitalizes the API of rowContainerHelper and
rowContainerIterator to make it clearer which functions are not
internal to the structure.

Release note: None

@RaduBerinde RaduBerinde requested a review from a team as a code owner October 19, 2021 02:22
@cockroach-teamcity
Member

This change is Reviewable

@RaduBerinde
Member Author


pkg/sql/opt/optbuilder/testdata/with, line 1184 at r1 (raw file):

) SELECT * FROM cte;
----
with &2 (cte)

This diff looks like it came out of left field, but before this change we didn't even try to build it as a recursive CTE. Now we do, and then realize it's not really recursive. The column IDs now match the case above.

Member

@yuzefovich yuzefovich left a comment

Nice work! I only have nits, so :lgtm:

Reviewed 15 of 15 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)


pkg/sql/apply_join.go, line 254 at r1 (raw file):

}

// runPlanInsidePlan is used to run a plan and gather the results in a row

nit: the comment should now mention the resultWriter.


pkg/sql/buffer_util.go, line 37 at r1 (raw file):

}

func (c *rowContainerHelper) init(

nit: do you think it'd be cleaner to keep the signature of init unchanged but to introduce initWithDeduplication that would take the new argument and init would call it passing in false?
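
The shape of this suggestion can be sketched as follows. This is a minimal illustration only: `rowBuffer` and its fields are hypothetical stand-ins, not the real `rowContainerHelper`, and the parameter list is simplified.

```go
package main

import "fmt"

// rowBuffer is a hypothetical stand-in for rowContainerHelper.
type rowBuffer struct {
	colTypes    []string
	deduplicate bool
}

// init keeps its original signature, so existing callers stay untouched.
// It delegates to initWithDeduplication with deduplication turned off.
func (b *rowBuffer) init(colTypes []string) {
	b.initWithDeduplication(colTypes, false)
}

// initWithDeduplication takes the new argument and does all of the setup.
func (b *rowBuffer) initWithDeduplication(colTypes []string, deduplicate bool) {
	b.colTypes = colTypes
	b.deduplicate = deduplicate
}

func main() {
	var plain, dedup rowBuffer
	plain.init([]string{"INT"})
	dedup.initWithDeduplication([]string{"INT"}, true)
	fmt.Println(plain.deduplicate, dedup.deduplicate) // false true
}
```

With this split, only the call sites that need deduplication have to change.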


pkg/sql/buffer_util.go, line 75 at r1 (raw file):

}

// addRowWithDedup adds a given row if not already present in the container.

super nit: I think s/a given row/the given row/.


pkg/sql/buffer_util.go, line 79 at r1 (raw file):

func (c *rowContainerHelper) addRowWithDedup(
	ctx context.Context, row tree.Datums,
) (ok bool, _ error) {

super nit: maybe do s/ok/added/ to be more explicit about the meaning of the boolean?
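
To show why a name like `added` reads better at call sites, here is a rough sketch of an add-if-absent method with that return shape. The `dedupContainer` type, the `row` type, and the string-join key are assumptions made for the example; the real container works on `tree.Datums` and is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// row is a simplified stand-in for tree.Datums.
type row []string

// dedupContainer is a hypothetical in-memory container that rejects duplicates.
type dedupContainer struct {
	width int
	seen  map[string]struct{}
	rows  []row
}

func newDedupContainer(width int) *dedupContainer {
	return &dedupContainer{width: width, seen: make(map[string]struct{})}
}

// addRowWithDedup adds the given row if not already present in the container.
// added reports whether the row was stored (false means it was a duplicate).
func (c *dedupContainer) addRowWithDedup(r row) (added bool, _ error) {
	if len(r) != c.width {
		return false, errors.New("row width mismatch")
	}
	// Simplification: assumes values never contain the NUL separator.
	key := strings.Join(r, "\x00")
	if _, dup := c.seen[key]; dup {
		return false, nil
	}
	c.seen[key] = struct{}{}
	c.rows = append(c.rows, r)
	return true, nil
}

func main() {
	c := newDedupContainer(2)
	for _, r := range []row{{"a", "1"}, {"a", "1"}, {"b", "2"}} {
		added, err := c.addRowWithDedup(r)
		fmt.Println(added, err)
	}
}
```

A caller can then write `if added { ... }` without having to remember what a bare `ok` meant.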


pkg/sql/recursive_cte.go, line 179 at r1 (raw file):

//
// If we are deduplicating, each row is either discarded if it has a duplicate
// in the allRows container or added to both alRows and workingRows otherwise.

nit: s/alRows/allRows/.


pkg/sql/opt/exec/factory.opt, line 693 at r1 (raw file):

#     - the plan is executed; the results are emitted and also saved in a new
#       buffer for the next iteration. If Deduplicate is true, only rows that
#       haven't been returned yet are saved.

nit: maybe s/are saved/are returned and saved/.
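
The iteration described in that comment can be sketched roughly as below. This is a simplified model over `int` rows, not the code in `recursive_cte.go`: `recursiveCTE`, `step`, and the `maxIter` guard are names invented for the example, and a plain map plays the role of the deduplicating container.

```go
package main

import "fmt"

// recursiveCTE evaluates a simplified recursive CTE.
// initial holds the rows of the initial query, step runs the recursive query
// against the rows saved by the previous iteration, and deduplicate selects
// UNION (true) or UNION ALL (false). maxIter is a guard for this sketch only.
func recursiveCTE(initial []int, step func(working []int) []int, deduplicate bool, maxIter int) []int {
	seen := make(map[int]struct{}) // every row returned so far
	var out []int

	// emit returns rows to the caller and saves them for the next iteration.
	// When deduplicating, rows that were already returned are dropped.
	emit := func(rows []int) []int {
		var next []int
		for _, r := range rows {
			if deduplicate {
				if _, dup := seen[r]; dup {
					continue
				}
				seen[r] = struct{}{}
			}
			out = append(out, r)
			next = append(next, r)
		}
		return next
	}

	working := emit(initial)
	for i := 0; len(working) > 0 && i < maxIter; i++ {
		working = emit(step(working))
	}
	return out
}

func main() {
	// A small cyclic graph: 1 -> 2 -> 3 -> 1.
	edges := map[int][]int{1: {2}, 2: {3}, 3: {1}}
	step := func(working []int) []int {
		var res []int
		for _, n := range working {
			res = append(res, edges[n]...)
		}
		return res
	}
	fmt.Println(recursiveCTE([]int{1}, step, true, 10))       // [1 2 3]
	fmt.Println(len(recursiveCTE([]int{1}, step, false, 10))) // 11
}
```

On the cyclic input, the deduplicating variant stops once an iteration produces no new rows, while the non-deduplicating variant only stops because of the `maxIter` guard.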


pkg/sql/opt/optbuilder/with.go, line 237 at r1 (raw file):

	initial, recursive, isUnionAll, ok := b.splitRecursiveCTE(cte.Stmt)
	// We don't currently support the UNION form (only UNION ALL).

nit: needs an update.

Member

@jordanlewis jordanlewis left a comment

Awesome!! It would be nice to include the variant in ordinary EXPLAIN as well. Unless I'm missing something, this will still just say recursive-cte and not include the detail about UNION vs UNION ALL.

Member Author

@RaduBerinde RaduBerinde left a comment

TFTRs! Added a field to EXPLAIN VERBOSE.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @yuzefovich)


pkg/sql/buffer_util.go, line 37 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: do you think it'd be cleaner to keep the signature of init unchanged but to introduce initWithDeduplication that would take the new argument and init would call it passing in false?

Done. I also added a commit that capitalizes the API.


pkg/sql/buffer_util.go, line 75 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: I think s/a given row/the given row/.

Done.


pkg/sql/buffer_util.go, line 79 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: maybe do s/ok/added/ to be more explicit about the meaning of the boolean?

Done.

Collaborator

@mgartner mgartner left a comment

:lgtm:

nit: the second commit is missing from the PR description

Reviewed 7 of 15 files at r1, 10 of 10 files at r2, 7 of 7 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)


pkg/sql/buffer_util.go, line 64 at r2 (raw file):

	}
	c.rows.Init(
		ordering, typs, &evalContext.EvalContext,

Does DoDeDuplicate require a non-empty ordering? It's unclear in the comments for DiskBackedRowContainer.

Member Author

@RaduBerinde RaduBerinde left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @yuzefovich)


pkg/sql/buffer_util.go, line 64 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Does DoDeDuplicate require a non-empty ordering? It's unclear in the comments for DiskBackedRowContainer.

AFAICT it deduplicates on the columns in the ordering
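
If that reading is right, the practical consequence is that only the ordering columns take part in the duplicate check, so deduplicating whole rows would need an ordering that lists every column. The sketch below illustrates that behavior with a hypothetical helper, `dedupOnColumns`; it is not the `DiskBackedRowContainer` API, and the behavior it models is the assumption stated in the comment above, not something verified here.

```go
package main

import (
	"fmt"
	"strings"
)

// dedupOnColumns keeps the first row seen for each distinct combination of
// values in keyCols. Columns outside keyCols are ignored by the comparison.
func dedupOnColumns(rows [][]string, keyCols []int) [][]string {
	seen := make(map[string]struct{})
	var out [][]string
	for _, r := range rows {
		parts := make([]string, len(keyCols))
		for i, c := range keyCols {
			parts[i] = r[c]
		}
		// Simplification: assumes values never contain the NUL separator.
		key := strings.Join(parts, "\x00")
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := [][]string{{"a", "1"}, {"a", "2"}, {"a", "1"}}
	// Key on column 0 only: ("a","2") is dropped even though it is a distinct row.
	fmt.Println(len(dedupOnColumns(rows, []int{0}))) // 1
	// Key on all columns: only the exact duplicate is dropped.
	fmt.Println(len(dedupOnColumns(rows, []int{0, 1}))) // 2
}
```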

Collaborator

@mgartner mgartner left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)


pkg/sql/buffer_util.go, line 64 at r2 (raw file):

Previously, RaduBerinde wrote…

AFAICT it deduplicates on the columns in the ordering

I see. Maybe add a comment here to explain that?

Member Author

@RaduBerinde RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @yuzefovich)


pkg/sql/buffer_util.go, line 64 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I see. Maybe add a comment here to explain that?

Done, also added to DoDeDuplicate.

Collaborator

@mgartner mgartner left a comment

Reviewed 8 of 8 files at r4, 7 of 7 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @yuzefovich)


pkg/sql/buffer_util.go, line 64 at r2 (raw file):

Previously, RaduBerinde wrote…

Done, also added to DoDeDuplicate.

Thanks!

@RaduBerinde
Member Author

bors r+

@craig
Contributor

craig bot commented Oct 20, 2021

Build failed:

@RaduBerinde
Member Author

bors r+

@craig
Contributor

craig bot commented Oct 21, 2021

Build succeeded:

@craig craig bot merged commit 7265486 into cockroachdb:master Oct 21, 2021
@RaduBerinde RaduBerinde deleted the recursive-cte-union branch October 25, 2021 18:17
Development

Successfully merging this pull request may close these issues.

sql: UNION variant of WITH RECURSIVE not implemented
5 participants