group_by memory efficiency regression #4334
This was referenced Apr 21, 2019
I guess we'll revise this when |
I see:
c(82210816, 312426496) / 1024
#> [1] 80284 305104
So we're back (approximately) to 0.7.8 sizes.
There seems to be a memory efficiency regression in grouping operations, introduced somewhere between 0.7.8 and the development version.
Using a 40 MB data frame:
To reproduce, use the following script:
Collect the data size and max RSS memory.
Run the script using devel and 0.7.8.
On my machine I am getting the following values:
The issue was initially spotted in a more complex query in which grouping was done by only 6 columns, so I believe it is related not to the number of grouping columns but to their cardinality.
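For readers who want to check the cardinality hypothesis themselves, here is a minimal sketch of such a measurement. It is not the script from this report: the row count, column names, and data are assumptions chosen only for illustration, and it reports sizes from inside R rather than max RSS. On Linux, max RSS could be collected separately, for example by running the script under GNU `time -v`.

```r
# Hypothetical sketch, NOT the original reproduction script.
# All names and sizes below are assumptions for illustration.
library(dplyr)

set.seed(1)
n <- 1e6  # assumed row count

# High-cardinality keys: most rows end up in their own group.
df <- data.frame(
  id  = sample.int(n, n, replace = TRUE),
  grp = sample(letters, n, replace = TRUE),
  val = runif(n)
)

cat("dplyr version:", as.character(packageVersion("dplyr")), "\n")
cat("data size:    ", format(object.size(df), units = "MB"), "\n")

# Grouping adds metadata whose size depends on the number of groups.
gdf <- group_by(df, id, grp)
cat("grouped size: ", format(object.size(gdf), units = "MB"), "\n")

# Memory statistics tracked by R's garbage collector,
# including the "max used" columns.
print(gc())
```

Running the same file against each installed dplyr version and comparing the printed sizes gives a rough view of how the grouping overhead scales with the number of groups.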