feat: implement as_polars_series() for some clock package types #861
If this is less performant than string parsing, the implementation is not worth its complexity, so I will do some benchmarking later.
In my environment, this implementation seems to be faster for more than 100,000 rows.

library(clock)
library(polars)

time_clock <- seq_len(10^5) |>
  as.POSIXct(tz = "UTC") |>
  as_zoned_time()

bench::mark(
  via_str = {
    time_clock |>
      as.character() |>
      (\(x) as_polars_series(x)$str$
        replace(r"(\[.*\])", "")$str$
        strptime(
          pl$Datetime("ms"), "%Y-%m-%dT%H:%M:%S%:z"
        ))()
  },
  via_double = {
    as_polars_series(time_clock)
  },
  check = FALSE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 via_str       712ms    712ms      1.40     6.9MB     1.40
#> 2 via_double    127ms    145ms      7.01    5.98MB     1.75

Created on 2024-02-29 with reprex v2.0.2
I'm really no expert on datetime handling, so I don't have much to say about this, but LGTM.
Just saw the conversation in the linked issue to the |
Because of the performance benefits shown in the benchmark above, I think we can leave it as is for now and replace it with a string-based implementation when clock is updated.
Part of #591
It looks like it could be implemented on the R side only.
There is room to improve performance, because when-then-otherwise is used to handle overflow when converting UInt64 to Int64, but I think it is a good starting point.
Created on 2024-02-28 with reprex v2.0.2