Skip to content

[SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction#43786

Closed
beliefer wants to merge 1 commit intoapache:masterfrom
beliefer:SPARK-45904
Closed

[SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction#43786
beliefer wants to merge 1 commit intoapache:masterfrom
beliefer:SPARK-45904

Conversation

@beliefer
Copy link
Contributor

What changes were proposed in this pull request?

Currently, the mode aggregate function only support sort desc with the input column.
But the mainstream database (e.g. H2 and Postgres) supports sort desc or asc.

#43663 will support MODE() WITHIN GROUP (ORDER BY col [ASC|DESC]) syntax.

Why are the changes needed?

Mode function should supports sort with order direction.

Does this PR introduce any user-facing change?

'No'.
New feature.

How was this patch tested?

Test cases updated.

Was this patch authored or co-authored using generative AI tooling?

'No'.

Copy link
Contributor Author

@beliefer beliefer Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not worth let isSortAsc as the child expression of Mode due to it's always none or foldable.

@peter-toth
Copy link
Contributor

peter-toth commented Nov 14, 2023

Actually this is literally the same implementation I came up with internally before I added deterministic mode in #42755. But then we reduced the sort argument (isSortAsc in this PR) to a simple deterministic flag to keep the change simple.

@beliefer, are you referring to the isSortAsc argument of the mode function or are you referring to the new MODE() WITHIN GROUP (ORDER BY col [ASC|DESC]) syntax by "mainstream databases support"?

cc @cloud-fan, @srielau

@beliefer
Copy link
Contributor Author

beliefer commented Nov 14, 2023

@beliefer, are you referring to the isSortAsc argument of the mode function or are you referring to the new MODE() WITHIN GROUP (ORDER BY col [ASC|DESC]) syntax by "mainstream databases support"?

Because I want add the new MODE() WITHIN GROUP (ORDER BY col [ASC|DESC]) syntax supported by mainstream databases. Please see #43663.

I found that the current implementation of mode only supports sort by column ASC if we make deterministicExpr is true.
We should let mode also supports by column DESC

@beliefer
Copy link
Contributor Author

ping @MaxGekk cc @cloud-fan

@beliefer
Copy link
Contributor Author

The GA failure is unrelated.

* @since 4.0.0
*/
def mode(e: Column, deterministic: Boolean): Column = Column.fn("mode", e, lit(deterministic))
def mode(e: Column, isSortAsc: Boolean): Column = Column.fn("mode", e, lit(isSortAsc))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do want to support specify ordering, we should support ordering columns. I don't agree with this partial API change. Let's finish the SQL side first, then we can think about how the Scala API should look like.

@beliefer
Copy link
Contributor Author

I will close due to #44184 merged.

@beliefer beliefer closed this Dec 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants