fix: check the pointer is valid #874
It seems to have little effect on performance.
Installed from 12d4bc2:
library(polars)
countries = c(
  "France", "Germany", "United Kingdom", "Japan", "Colombia",
  "South Korea", "Vietnam", "South Africa", "Senegal", "Iran"
)
set.seed(123)
df_test = pl$DataFrame(
  country = sample(countries, 1e7, TRUE),
  x = sample(1:100, 1e7, TRUE),
  y = sample(1:1000, 1e7, TRUE)
)
lf_test = df_test$lazy()
lazy_query = lf_test$
  sort(pl$col("country"))$
  filter(
    pl$col("country")$is_in(pl$lit(c("United Kingdom", "Japan", "Vietnam")))
  )
bench::mark(
  eager = df_test$
    sort(pl$col("country"))$
    filter(
      pl$col("country")$is_in(pl$lit(c("United Kingdom", "Japan", "Vietnam")))
    ),
  lazy = lazy_query$collect(),
  iterations = 10
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eager 911ms 1.07s 0.934 14.56KB 0
#> 2 lazy 488ms 593.78ms 1.71 1.94KB 0
Created on 2024-03-01 with reprex v2.0.2
Installed from main (bec40c6):
library(polars)
countries = c(
  "France", "Germany", "United Kingdom", "Japan", "Colombia",
  "South Korea", "Vietnam", "South Africa", "Senegal", "Iran"
)
set.seed(123)
df_test = pl$DataFrame(
  country = sample(countries, 1e7, TRUE),
  x = sample(1:100, 1e7, TRUE),
  y = sample(1:1000, 1e7, TRUE)
)
lf_test = df_test$lazy()
lazy_query = lf_test$
  sort(pl$col("country"))$
  filter(
    pl$col("country")$is_in(pl$lit(c("United Kingdom", "Japan", "Vietnam")))
  )
bench::mark(
  eager = df_test$
    sort(pl$col("country"))$
    filter(
      pl$col("country")$is_in(pl$lit(c("United Kingdom", "Japan", "Vietnam")))
    ),
  lazy = lazy_query$collect(),
  iterations = 10
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eager 978ms 1.05s 0.906 12.46KB 0
#> 2 lazy 532ms 574.97ms 1.72 1.94KB 0
Created on 2024-03-01 with reprex v2.0.2
Hello! Allow me to give my two cents on this.
Thanks, I confirm the test doesn't segfault locally. I'm gonna let @CGMossa and others discuss the best method here; I don't have enough knowledge for that.
Can you bump NEWS?
Thanks! I actually saw that null check in cpp11, but didn't realize until today when it is actually needed :)
https://github.com/r-lib/cpp11/blob/main/inst/include/cpp11/external_pointer.hpp
I'll fix savvy first.
@CGMossa Thanks for taking a look at this.
I've been experimenting and fixing how we do externalptr in
Here's my take. Hope this helps.
library(savvyExamples)
rds_file <- tempfile()
x <- Person()
saveRDS(x, rds_file)
x <- readRDS(rds_file)
x$name()
#> Error: Invalid external pointer
Created on 2024-03-01 with reprex v2.1.0
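For readers wondering why the pointer is invalid after `readRDS()`: R serialization does not keep the address held by an external pointer, so the restored object comes back holding a null address. The following is a minimal base-R sketch of that behaviour, not the code from this PR or from savvy. It assumes that `stats:::C_dnorm$address` is a live external pointer on your R build; `is_null_ptr()` and `use_ptr()` are hypothetical helper names introduced only for illustration.

```r
# Sketch only: base R, no polars/savvy/extendr involved.
# Assumption: stats:::C_dnorm$address is a live (non-null) external pointer.
ptr <- stats:::C_dnorm$address
stopifnot(typeof(ptr) == "externalptr")

# Hypothetical helper: TRUE when `p` holds a null address.
# By default identical() compares external pointers by the address they hold,
# so we compare `p` against a fresh null pointer with the same attributes.
is_null_ptr <- function(p) {
  null_ptr <- methods::new("externalptr")
  attributes(null_ptr) <- attributes(p)
  identical(p, null_ptr)
}

# Round trip through serialization, which is what saveRDS()/readRDS() do.
restored <- unserialize(serialize(ptr, connection = NULL))

# Hypothetical guard: raise an R error instead of dereferencing a null pointer.
use_ptr <- function(p) {
  if (is_null_ptr(p)) {
    stop("Invalid external pointer", call. = FALSE)
  }
  invisible(p)
}

is_null_ptr(ptr)       # the live pointer
is_null_ptr(restored)  # the pointer after the round trip
```

Calling `use_ptr(restored)` then signals an ordinary R error, which is the kind of behaviour the check discussed here aims for in place of a segfault. In a real binding the check would sit on the native side, before the pointer is dereferenced.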
Fix #851
I feel this should ideally be implemented in extendr (or savvy etc.), but I will implement it here anyway.
cc @yutannihilation @CGMossa @JosiahParry