ARROW-6657: [Rust] [DataFusion] Add Count Aggregate Expression#5513

Closed
sinistersnare wants to merge 2 commits into apache:master from sinistersnare:ARROW-6657

@sinistersnare
Contributor

Hi, I added this code, and the tests pass. I still need to test it with a real example, so I would say it's not completely ready to merge yet.

@paddyhoran
Contributor

Hi @sinistersnare, thanks for this! Just let us know when you think it is ready for review (or if you have any questions).

Member

@andygrove andygrove left a comment

This looks great! Please add SQL tests to context.rs based on the ones for SUM:

https://github.com/apache/arrow/blob/master/rust/datafusion/src/execution/context.rs#L616-L642

@sinistersnare
Contributor Author

Those are exactly the tests I was looking for. Thanks, I will push an update tonight!

@andygrove
Member

@sinistersnare I see you merged master into your branch. That can lead to issues because we don't use a merging model on this repo. See https://andygrove.io/apache_arrow_git_tips/ for more info.

@sinistersnare
Contributor Author

This took a bit longer than expected (I'm in the middle of moving), but I added some SQL tests! Aside from my worry above, I think this is ready.

@sinistersnare
Contributor Author

Fixed the style errors too. @andygrove @paddyhoran, this should be good to go!

Member

@andygrove andygrove left a comment

LGTM! Thanks @sinistersnare

Member

I just spotted an issue with this, and the SumExpr implementation has the same issue: we are evaluating the expression against the whole batch multiple times (once for every row in the batch). I guess this is a design flaw in the accumulator trait. I'll give this some thought today.
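
To make the cost concrete, here is a minimal sketch in plain Rust. It does not use the real DataFusion accumulator trait or Arrow arrays: `Batch`, `ColumnExpr`, `count_per_row`, and `count_per_batch` are hypothetical stand-ins invented for illustration, and the null-skipping count is an assumption based on standard SQL `COUNT(expr)` semantics. It only contrasts re-evaluating an expression once per row with evaluating it once per batch; it is not the fix proposed later in this thread.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a record batch: one nullable integer column.
struct Batch {
    column: Vec<Option<i64>>,
}

/// Hypothetical stand-in for an expression. It records how many times it
/// has been evaluated so the two strategies can be compared.
struct ColumnExpr {
    evaluations: Cell<usize>,
}

impl ColumnExpr {
    fn new() -> Self {
        ColumnExpr { evaluations: Cell::new(0) }
    }

    /// Evaluating always produces the values for the whole batch.
    fn evaluate(&self, batch: &Batch) -> Vec<Option<i64>> {
        self.evaluations.set(self.evaluations.get() + 1);
        batch.column.clone()
    }
}

/// Row-at-a-time accumulation: the expression is re-evaluated against the
/// whole batch for every row, then a single value is picked out.
fn count_per_row(expr: &ColumnExpr, batch: &Batch) -> u64 {
    let mut count = 0;
    for row in 0..batch.column.len() {
        let values = expr.evaluate(batch); // whole-batch work, repeated per row
        if values[row].is_some() {
            count += 1;
        }
    }
    count
}

/// Batch-at-a-time accumulation: the expression is evaluated once and the
/// non-null values are counted in a single pass.
fn count_per_batch(expr: &ColumnExpr, batch: &Batch) -> u64 {
    let values = expr.evaluate(batch);
    values.iter().filter(|v| v.is_some()).count() as u64
}
```

Both functions return the same count, but for a batch of N rows the row-at-a-time version calls `evaluate` N times while the batch-at-a-time version calls it once, so the total work grows roughly quadratically with batch size in the first case.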

Contributor Author

Would it be best to merge this without the optimization/fix, so that you can fix both instances at the same time?

Member

Please take a look at the proposed fix in #5542 and let me know what you think. I'd prefer to get that fix reviewed and merged first; then you can rebase this PR and implement the changes.

@andygrove
Member

@sinistersnare Please rebase against the latest master, and I can approve and merge.

@sinistersnare
Contributor Author

@andygrove updated!

Member

@andygrove andygrove left a comment

LGTM pending CI

@andygrove andygrove closed this in 368562b Oct 4, 2019
@sinistersnare sinistersnare deleted the ARROW-6657 branch October 4, 2019 22:20
kszucs pushed a commit that referenced this pull request Oct 5, 2019
Hi, I added this code, and the tests pass. I still need to test it with a real example, so I would say it's not completely ready to merge yet.

Closes #5513 from sinistersnare/ARROW-6657 and squashes the following commits:

64d0c00 <Andy Grove> formatting
12d0c2c <Davis Silverman> Add Count Aggregate Expression

Lead-authored-by: Davis Silverman <sinistersnare@gmail.com>
Co-authored-by: Andy Grove <andygrove73@gmail.com>
Signed-off-by: Andy Grove <andygrove73@gmail.com>

3 participants