tidy + optimize pl$DataFrame, pl$Series #385
Cool, always nice to see the internals become less complicated 😄
While you're refactoring |
it could be some syntactic sugar for If to support direct conversion from any R type to any polars type. That would be a big rework. Since casting is faster than R-polars conversion, it is probably not worth it. |
@sorhawell I took a stab at implementing the arg |
Actually, I forgot LazyFrame() was just a simple wrapper. I added a few tests and am just waiting for CI to pass, thanks @sorhawell!
Summary of refactoring:
Remove the internal is_DataFrame_data_input() guard from pl$DataFrame. Reason: it is not needed, and the error messages raised deeper in the conversions are better. The guard could also block valid conversions of R classes that are neither a list nor a vector.
Remove the dead pl$DataFrame args parallel and via_select. They are undocumented, do nothing, and should have been removed in a previous PR. I would argue that removing some exotic dead args should not count as a breaking change.
Add result_minimal. as_polars_series uses an error to signal internally that no implementation was found for a given R class, or that the implementation does not work. result_minimal is a lightweight version of result which skips all the error upgrading/conversion when catching this signal. Any other error from as_polars_series that was not due to a missing implementation is raised.
pl$Series: drop the unneeded guard if (inherits(x, "Series")) {... since the internal conversion does this anyway.
Remove the convert_to_fewer_types() preprocessing, which is now covered by as_polars_series.POSIXlt.
This is a follow-up to PR #369, which brought optional support for vctrs or any other classed R object via the S3 method as_polars_series.
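The error-as-signal idea behind result_minimal in the summary above can be sketched in plain R. This is only an illustration under stated assumptions, not the r-polars source: to_series_like, convert_column, the no_impl_error class and the character fallback are all hypothetical names and choices made up for this example, and the result_minimal shown here is a simplified stand-in for the real helper.

```r
# Simplified stand-in (assumption): capture a value or an error in a plain
# list, without upgrading or re-wrapping the error object.
result_minimal <- function(expr) {
  tryCatch(
    list(ok = expr, err = NULL),
    error = function(e) list(ok = NULL, err = e)
  )
}

# Hypothetical S3 generic standing in for a classed-object converter.
to_series_like <- function(x) UseMethod("to_series_like")

# The default method signals "no implementation" with a classed error.
to_series_like.default <- function(x) {
  stop(structure(
    class = c("no_impl_error", "error", "condition"),
    list(message = paste("no conversion for class", class(x)[1]), call = NULL)
  ))
}

# One concrete method, so that some inputs convert successfully.
to_series_like.numeric <- function(x) list(values = x, dtype = "Float64")

# Hypothetical caller: only a missing implementation triggers the fallback;
# any other error is raised again unchanged.
convert_column <- function(x) {
  res <- result_minimal(to_series_like(x))
  if (is.null(res$err)) {
    return(res$ok)
  }
  if (inherits(res$err, "no_impl_error")) {
    # Assumed fallback for illustration only: store values as strings.
    return(list(values = as.character(x), dtype = "Utf8"))
  }
  stop(res$err)
}
```

Calling convert_column(c(1, 2)) takes the numeric method, while an object with an unknown class such as structure(1:3, class = "my_class") reaches the default method and ends up in the fallback branch. Keeping the wrapper this small is the point of the sketch: the per-column cost stays close to that of a bare tryCatch.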
When running with debug I see that the implementation calls methods left and right and takes 1 ms per column! This is really only an issue for very wide tables. However, I propose a cleanup of the implementation which is also a bit faster, .25 ms per small column or so. I have also removed some old code that is no longer needed in pl$DataFrame.
and after
Close #336