[SPARK-29535][SQL] ADD some aggregate functions for Column in RelationalGroupedDataset.scala by TomokoKomiyama · Pull Request #26192 · apache/spark

TomokoKomiyama · 2019-10-21T09:22:36Z

What changes were proposed in this pull request?

Add five aggregation functions with Column type parameters.

mean(Column, Column*)
max(Column, Column*)
avg(Column, Column*)
min(Column, Column*)
sum(Column, Column*)

Why are the changes needed?

To pass Column-typed arguments to these aggregation functions, we currently have to wrap them in agg(), which is redundant.

df.groupBy("_c0").agg(max($"_c1"))

Other functions on the grouped dataset, such as pivot(), accept Column arguments, but these aggregation functions do not, so the following is not possible today:
df.groupBy("_c0").max($"_c1")
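
For context, here is a minimal sketch of the two forms that already work, next to the form this PR proposes. The local SparkSession, the object name and the toy data are assumptions made for illustration. They do not come from the PR.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

// Illustrative only: object name, session settings and data are hypothetical.
object GroupedMaxExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouped-max-example")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    val df = Seq(("a", 1), ("a", 3), ("b", 2)).toDF("_c0", "_c1")

    // Existing String-based overload on RelationalGroupedDataset.
    df.groupBy("_c0").max("_c1").show()

    // Existing Column-based route: wrap the aggregate in agg().
    df.groupBy("_c0").agg(max($"_c1")).show()

    // Proposed by this PR; does not compile without the patch:
    // df.groupBy("_c0").max($"_c1").show()

    spark.stop()
  }
}
```

The agg() form also accepts several aggregate expressions at once, which is the alternative the reviewers point to in the discussion.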

Does this PR introduce any user-facing change?

Yes.

Users will be able to pass Column-typed arguments to the aggregation functions (mean, max, avg, min, sum) without going through agg():
df.groupBy("_c0").max($"_c1")

How was this patch tested?

Manually tested.

AmplabJenkins · 2019-10-21T09:30:44Z

Can one of the admins verify this patch?

HyukjinKwon

I won't add APIs just for consistency. You can use agg, right?

TomokoKomiyama · 2019-10-23T01:33:59Z

@HyukjinKwon
We can use agg, but I think it would be easier for users if these aggregation functions accepted Column arguments in the same way they accept String ones.
df.groupBy("_c0").max("_c1")
df.groupBy("_c0").max($"_c1")
Other functions accept String parameters for legacy reasons, and those overloads will be removed someday, right?
I think it's better to let these aggregation functions (min, max, ...) accept Column parameters so that users are not confused when the legacy overloads are removed.

dongjoon-hyun · 2019-10-24T21:08:43Z

Sorry, @TomokoKomiyama. I also agree with @HyukjinKwon.

cc @sarutak

sarutak · 2019-10-25T13:24:33Z

@dongjoon-hyun She doesn't mind closing this PR.

maropu · 2019-10-25T13:51:31Z

+1, too. I'll close this. Thanks.

dongjoon-hyun · 2019-10-25T14:56:07Z

Thank you all! Especially @TomokoKomiyama and @sarutak.
IIRC, there was a similar discussion before and we made the same decision (to minimize the API).

add aggregate functions for Column

40fc2ff

HyukjinKwon reviewed Oct 21, 2019

dongjoon-hyun added the SQL label Oct 22, 2019

maropu closed this Oct 25, 2019

TomokoKomiyama wants to merge 1 commit into apache:master from TomokoKomiyama:add-col
