Add test for example where $over(..., order_by = ...) is useful #124

Open
etiennebacher opened this issue Jul 8, 2024 · 1 comment
Labels
upstream This requires changes in polars or in a tidyverse package.

Comments

@etiennebacher
Owner

See examples added in pola-rs/r-polars#1147

@etiennebacher changed the title from "Add test for example where $over(... order_by = ...) is useful" to "Add test for example where $over(..., order_by = ...) is useful" on Jul 8, 2024
@etiennebacher
Owner Author

I don't think this is possible, because I can't nest several $over() calls in the same expression (see pola-rs/polars#12051):

library(polars)
library(dplyr, warn.conflicts = FALSE)
options(polars.do_not_repeat_call = TRUE)

df = pl$DataFrame(
  g = c(1, 1, 1, 1, 2, 2, 2, 2),
  t = c(1, 2, 3, 4, 4, 1, 2, 3),
  x = c(10, 20, 30, 40, 10, 20, 30, 40)
)

# Works: lag x within each group g, ordering rows by t inside the window
df$with_columns(
  x_lag = pl$col("x")$shift(1)$over("g", order_by = "t")
)
#> shape: (8, 4)
#> ┌─────┬─────┬──────┬───────┐
#> │ g   ┆ t   ┆ x    ┆ x_lag │
#> │ --- ┆ --- ┆ ---  ┆ ---   │
#> │ f64 ┆ f64 ┆ f64  ┆ f64   │
#> ╞═════╪═════╪══════╪═══════╡
#> │ 1.0 ┆ 1.0 ┆ 10.0 ┆ null  │
#> │ 1.0 ┆ 2.0 ┆ 20.0 ┆ 10.0  │
#> │ 1.0 ┆ 3.0 ┆ 30.0 ┆ 20.0  │
#> │ 1.0 ┆ 4.0 ┆ 40.0 ┆ 30.0  │
#> │ 2.0 ┆ 4.0 ┆ 10.0 ┆ 40.0  │
#> │ 2.0 ┆ 1.0 ┆ 20.0 ┆ null  │
#> │ 2.0 ┆ 2.0 ┆ 30.0 ┆ 20.0  │
#> │ 2.0 ┆ 3.0 ┆ 40.0 ┆ 30.0  │
#> └─────┴─────┴──────┴───────┘

# Fails: splitting the ordering and the grouping into two nested $over() calls
df$with_columns(
  x_lag = (pl$col("x")$shift(1)$over(order_by = "t"))$over("g")
)
#> Error: Execution halted with the following contexts
#>    0: In R: in $with_columns()
#>    1: Encountered the following error in Rust-Polars:
#>          invalid operation: window expression not allowed in aggregation

# dplyr equivalent of the first call, for comparison
df |>
  as_tibble() |>
  mutate(
    x_lag = lag(x, order_by = t),
    .by = g
  )
#> # A tibble: 8 × 4
#>       g     t     x x_lag
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     1    10    NA
#> 2     1     2    20    10
#> 3     1     3    30    20
#> 4     1     4    40    30
#> 5     2     4    10    40
#> 6     2     1    20    NA
#> 7     2     2    30    20
#> 8     2     3    40    30
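
For whoever writes the test later, the expected x_lag values can be cross-checked without polars or dplyr. The snippet below is only a sketch in base R: `lag_by()` is a hypothetical helper made up for this comment, not a function from any of the packages above.

```r
# Same data as in the reprex, as a plain data.frame
df <- data.frame(
  g = c(1, 1, 1, 1, 2, 2, 2, 2),
  t = c(1, 2, 3, 4, 4, 1, 2, 3),
  x = c(10, 20, 30, 40, 10, 20, 30, 40)
)

# Hypothetical helper: lag x by one position in the order given by t,
# then put the results back in the original row order
lag_by <- function(x, t) {
  o <- order(t)
  out <- rep(NA_real_, length(x))
  out[o] <- c(NA, head(x[o], -1))
  out
}

# Apply the helper within each group g, working on row indices
df$x_lag <- ave(
  seq_len(nrow(df)), df$g,
  FUN = function(i) lag_by(df$x[i], df$t[i])
)

expected <- c(NA, 10, 20, 30, 40, NA, 20, 30)
stopifnot(isTRUE(all.equal(df$x_lag, expected)))
```

The `expected` vector matches the x_lag column printed by both the working polars call and the dplyr call above, so it could serve as the reference in a test once the translation is possible.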

@etiennebacher added the upstream label on Jul 8, 2024