-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S3 methods for DataFrame and LazyFrame: simplest #107
Conversation
I think e.g. nrow cannot be inferred without collect, however it could be implemented something like where only the count is returned. If polars will optimize away if possible. x$count()$collect() |
I have completed all I planned to achieve in this PR, so feel free to take a look when you find some time. As always, I'm happy to make any changes you'd like. The cool thing about my two recent PRs is that it allowed me to considerably simplify the Get Started vignette (@grantmcdermott may want to take a look). Here's a rendered version for your convenience: https://arelbundock.com/polars.html Here are the main things I changed in the vignette:
|
"For what it’s worth, the previous query could have been written more concisely as" dat$with_columns(
pl$col(c("mpg", "hp"))$sum()$over("cyl")$prefix("sum_")
) #BTW also possible but not fully stabilized, so do not recommend in tutorial # has currenyly a thread-safety user warning. But could be removed now.
dat$with_columns(
pl$col(c("mpg", "hp"))$sum()$over("cyl")$map_alias(\(x) paste0("sum_",x))
)
# has not stabilized becaues does not work well with wild-cards like $all()
pl$set_polars_options(named_exprs = TRUE) # must be activted like this first
dat$with_columns(
sum_mpg = pl$col("mpg")$sum()$over("cyl"),
sum_hp = pl$col("hp")$sum()$over("cyl")
)
#could be fun to try allow this also :) as an opt-in feature. All column names converted to `pl$col(x)` and in scope of the method. Not sure it would work well for lazy API though.
dat$with_columns(
sum_mpg = mpg$sum()$over(cyl),
sum_hp = hp$sum()$over(cyl)
) |
Half-way through the revised tutorial . It has really been a pleasure to read. I agree with your changes. I will add a few syntax edits. |
Agree, lots to like here. Will leave a few comments for you to think about. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots to like here @vincentarelbundock.
One thing I haven't mentioned as a specific comment---but I think is good for consistency---is sticking to the hard 80 character wrapping. I know that you're a fellow vim user and this is the hook I always use ;-)
vignettes/polars.Rmd
Outdated
and/or regular R vectors. | ||
## Methods and pipelines | ||
|
||
Although some simple R functions work out of the box on **polars** objects, the full power of Polars is realized via _methods_. Polars methods are accessed using the `$` syntax. For example, to convert Polars `Series` and `DataFrames` back to standard R objects, we use the `$to_r_vector()` and `$as_data_frame()` methods: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a while since I tried this, but is as_data_frame
still a tibble-exported (or maybe arrow-exported) function? Just worth thinking ahead to whether users might run into a NAMESPACE clash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no clash between exported functions and polars methods, though, right? And this PR adds a proper as.data.frame()
s3 for polars objects, which just calls df$as_data_frame()
under the hood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool cool. Just wanted to check.
FYI, I believe I have addressed all the points raised in review so far. Feel free to make more requests if needed. |
Many thanks @vincentarelbundock :) |
Issue #104
Notes:
@noRd
tags because I did not want to pollute the manual. But for some reason skippingnrow
andncol
ledcheck()
to issue a warning, so I included minimal docs for just those two.TODO in a different PR:
This PR implements a few of the main base R S3 methods for
DataFrame
andLazyFrame
objects. For example: