Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table.group_by #3305

Merged
merged 20 commits into from
Mar 1, 2022
Merged

Table.group_by #3305

merged 20 commits into from
Mar 1, 2022

Conversation

jdunkerley
Copy link
Member

@jdunkerley jdunkerley commented Feb 25, 2022

Pull Request Description

Functioning group_by based of Enso Map.

Important Notes

This is an initial version which will be used to establish the API.
The grouping map will need to be moved to Java code for performance.

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the Scala, Java, and Rust style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH ./run dist and ./run watch.

Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Some style comments and questions but given it's a prototype, most are just suggestions for the future.

The important thing is the Group_By_Key - IMO it should not repeat the implementations of existing functions if it can delegate to them unless it is really absolutely necessary.

- population argument specifies if group is a sample or the population
type Standard_Deviation (column:Column|Text|Integer) (name:Text|Nothing=Nothing) (population:Boolean=False)

## Creates a new column with the values concatenated together. NULL values will become an empty string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. why is NULL collated with "" here instead of being ignored like in other cases ?

Comment on lines +543 to +546
array = new_table.at (j + key_length) . at 1 . to_array
current = array . at row_index
new = aggregator current i
array . set_at row_index new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may stop working when we fix the "to_array is leaking mutability" issue.

As this prototype may live shorter than the above issue - probably ignore this comment.

But just for future - maybe we could just create an Expandable_Array which wraps a regular, explicitly-mutable array but also has the append operation and does the allocations that give us O(1) amortized complexity of append. Because I don't like that we are abusing the Vector_Builder a bit when we could just create a data structure meant for such use-cases. But since this is a prototype - just commenting a suggestion for future.

@jdunkerley jdunkerley requested a review from radeusgd March 1, 2022 15:06
@jdunkerley jdunkerley marked this pull request as ready for review March 1, 2022 15:06
@jdunkerley jdunkerley requested a review from 4e6 as a code owner March 1, 2022 15:06
Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Mar 1, 2022
@mergify mergify bot merged commit 738a691 into develop Mar 1, 2022
@mergify mergify bot deleted the wip/jd/group-by-181268783 branch March 1, 2022 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants