Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix distsql planning issue with optimizer (streaming aggregation) #31825

Merged
merged 1 commit into from Oct 25, 2018

Conversation

RaduBerinde
Copy link
Member

@RaduBerinde RaduBerinde commented Oct 24, 2018

The distsql planner looks at physical properties of various plan nodes
to figure out what orderings it needs to preserve when merging data
from multiple nodes. These are passed as required orderings in the
exec.Factory, but not for all operators - only those which would
actually matter.

There is a code path related to setting up streaming aggregations
which was unaccounted for which looks at the ordering of the input
node.

To fix this, we now pass required orderings to filters and
projections. I looked at all the operators and the rest either:

  • never produce an ordering as far as the optimizer is concerned
    (VirtualScan, HashJoin, ScalarGroupBy, SetOp, ProjectSet); or
  • they pass through the input ordering unchanged (Distinct); or
  • they cannot be distributed (Limit, Ordinality).

Release note (bug fix): Fixed an issue which causes invalid results or
an "incorrectly ordered stream" error with streaming aggregations (in
some cases).

@RaduBerinde RaduBerinde requested review from rytaft, andy-kimball and a team October 24, 2018 19:26
@RaduBerinde RaduBerinde requested a review from a team as a code owner October 24, 2018 19:26
@RaduBerinde RaduBerinde requested a review from a team October 24, 2018 19:26
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde RaduBerinde force-pushed the distsql-agg-ordering-bug branch 2 times, most recently from c028b93 to 8cf5e62 Compare October 24, 2018 20:21
Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 13 of 13 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/exec/execbuilder/relational_builder.go, line 98 at r1 (raw file):

}

// reqOrdering converts an OrderingChoice to a ColumnOrdering (according to the

reqOrdering -> sqlOrderingFromChoice

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained


pkg/sql/opt/exec/execbuilder/relational_builder.go, line 98 at r1 (raw file):

Previously, rytaft wrote…

reqOrdering -> sqlOrderingFromChoice

Done.

@RaduBerinde
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 25, 2018

Build failed

The distsql planner looks at physical properties of various plan nodes
to figure out what orderings it needs to preserve when merging data
from multiple nodes. These are passed as required orderings in the
exec.Factory, but not for all operators - only those for which it
would actually matter.

There is a code path related to setting up streaming aggregations
which was unaccounted for which looks at the ordering of the input
node.

To fix this, we now pass required orderings to filters and
projections. I looked at all the operators and the rest either:
 - never produce an ordering as far as the optimizer is concerned
   (VirtualScan, HashJoin, ScalarGroupBy, SetOp, ProjectSet); or
 - they pass through the input ordering unchanged (Distinct); or
 - they cannot be distributed (Limit, Ordinality).

Release note (bug fix): Fixed an issue which causes invalid results or
an "incorrectly ordered stream" error with streaming aggregations (in
some cases).
@RaduBerinde
Copy link
Member Author

bors r+

craig bot pushed a commit that referenced this pull request Oct 25, 2018
31825: sql: fix distsql planning issue with optimizer (streaming aggregation) r=RaduBerinde a=RaduBerinde

The distsql planner looks at physical properties of various plan nodes
to figure out what orderings it needs to preserve when merging data
from multiple nodes. These are passed as required orderings in the
exec.Factory, but not for all operators - only those which would
actually matter.

There is a code path related to setting up streaming aggregations
which was unaccounted for which looks at the ordering of the input
node.

To fix this, we now pass required orderings to filters and
projections. I looked at all the operators and the rest either:
 - never produce an ordering as far as the optimizer is concerned
   (VirtualScan, HashJoin, ScalarGroupBy, SetOp, ProjectSet); or
 - they pass through the input ordering unchanged (Distinct); or
 - they cannot be distributed (Limit, Ordinality).

Release note (bug fix): Fixed an issue which causes invalid results or
an "incorrectly ordered stream" error with streaming aggregations (in
some cases).

Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Oct 25, 2018

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants