Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Avoid split.data.frame if there's only one panel #4991

Merged
merged 2 commits into from
Jan 5, 2023

Conversation

zeehio
Copy link
Contributor

@zeehio zeehio commented Sep 16, 2022

Splitting a data frame according to a factor with one level will give a list with the data frame.

On layers with 1E7 data points, this split call costs 1.7 seconds out of the 18 seconds it currently takes me to build the plot. I intend to drive those 18 seconds to 6 seconds or less.

I have (maybe too extensively) documented my progress at:

This is the second low hanging fruit after #4990.

Happy to further discuss this 😃

Proposed NEWS entry (not added to avoid trivial conflicts)

* Performance improvement on plots without facets and lots of data
  points (@zeehio, #4991)
``

R/geom-.r Outdated Show resolved Hide resolved
@teunbrand
Copy link
Collaborator

This looks like a sensible step to me in general. Out of scope for this PR however, this makes me wonder whether to apply this trick more broadly. Or even switch to a vctrs::vec_split() based approach, which even for full data.frame splits seem to be about half as expensive as base::split.data.frame().

@zeehio
Copy link
Contributor Author

zeehio commented Dec 24, 2022

Thanks for the code review, @teunbrand .

I applied your suggestion for coercing to factors. I did not go for replacing base::split.data.frame() with vctrs::vec_split() here. It probably makes sense to replace split() with vctrs::vec_split() in multiple places across ggplot2, so I'd rather take care of all those replacements at once, if needed, in a different PR.

Copy link
Collaborator

@teunbrand teunbrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

@teunbrand teunbrand merged commit 853266c into tidyverse:main Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants