Can bind_rows_polars() take a named list like bind_rows() does for the .id argument? #116

Closed
ginolhac opened this issue May 16, 2024 · 3 comments

What functionality are you missing?

I often use bind_rows() with a named list to get meaningful names in the id column instead of integer ids.

Example:

t1 <- tibble(
  x = c("a", "b"),
  y = 1:2
)
t2 <- tibble(
  x = c("c", "d"),
  y = 3:4
)
bind_rows(t1 = t1, tib2 = t2, .id = "id")

gives the chosen names for ids:

# A tibble: 4 × 3
  id    x         y
  <chr> <chr> <int>
1 t1    a         1
2 t1    b         2
3 tib2  c         3
4 tib2  d         4
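
The call above passes named arguments, while the title mentions a named list. As a minimal sketch in plain dplyr (it does not touch tidypolars), both forms should give the same id column; `frames`, `by_args` and `by_list` are illustrative names only:

```r
library(dplyr, warn.conflicts = FALSE)

t1 <- tibble(x = c("a", "b"), y = 1:2)
t2 <- tibble(x = c("c", "d"), y = 3:4)

# Named arguments: the argument names become the values of the id column.
by_args <- bind_rows(t1 = t1, tib2 = t2, .id = "id")

# Named list: the list names are used in the same way.
frames <- list(t1 = t1, tib2 = t2)
by_list <- bind_rows(frames, .id = "id")

# Compare the two results.
all.equal(by_args, by_list)
```

The list form is convenient when the data frames are produced programmatically, for example by reading several files into a list.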

Is this functionality present in the tidyverse or in polars (or both)?
Present in the tidyverse

With {tidypolars}:

p1 <- pl$DataFrame(
  x = c("a", "b"),
  y = 1:2
)
p2 <- pl$DataFrame(
  x = c("c", "d"),
  y = 3:4
)

# create an id column
bind_rows_polars(p1, p2, .id = "id")

gives integers as ids:

shape: (4, 3)
┌─────┬─────┬─────┐
│ id  ┆ x   ┆ y   │
│ --- ┆ --- ┆ --- │
│ i32 ┆ str ┆ i32 │
╞═════╪═════╪═════╡
│ 1   ┆ a   ┆ 1   │
│ 1   ┆ b   ┆ 2   │
│ 2   ┆ c   ┆ 3   │
│ 2   ┆ d   ┆ 4   │
└─────┴─────┴─────┘

ginolhac commented May 16, 2024

This is really a feature request: I currently use the following workaround to get meaningful names:

bind_rows_polars(select(human, type) |> mutate(animal = "human"),
                 select(salmon, type) |> mutate(animal = "salmon")) |> 
  count(type, animal) |> 
  collect()
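
The same idea can be generalised to any number of inputs. The sketch below is written against plain dplyr tibbles so that it runs without polars; `bind_rows_named()` is a hypothetical helper (it exists in neither dplyr nor tidypolars), and the `human` and `salmon` tibbles are made-up stand-ins for the real data, which is not shown in this thread:

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: tag every data frame in a named list with its own
# name, then stack the tagged frames.
bind_rows_named <- function(frames, id = "id") {
  stopifnot(!is.null(names(frames)), all(nzchar(names(frames))))
  tagged <- Map(
    function(df, nm) mutate(df, "{id}" := nm),
    frames,
    names(frames)
  )
  bind_rows(unname(tagged))
}

# Toy stand-ins for the real inputs (assumed shape: one `type` column).
human  <- tibble(type = c("coding", "coding", "pseudogene"))
salmon <- tibble(type = c("coding", "pseudogene"))

counts <- bind_rows_named(list(human = human, salmon = salmon), id = "animal") |>
  count(type, animal)
counts
```

Whether the same helper works unchanged on polars DataFrames or LazyFrames depends on tidypolars supporting every verb used here, which this sketch does not check.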

@etiennebacher
Owner

Thanks, I wasn't aware of this use case. It is now possible with the development version:

library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

t1 <- tibble(
  x = c("a", "b"),
  y = 1:2
)
t2 <- tibble(
  x = c("c", "d"),
  y = 3:4
)
bind_rows(t1 = t1, tib2 = t2, .id = "id")
#> # A tibble: 4 × 3
#>   id    x         y
#>   <chr> <chr> <int>
#> 1 t1    a         1
#> 2 t1    b         2
#> 3 tib2  c         3
#> 4 tib2  d         4

t1 <- as_polars_df(t1)
t2 <- as_polars_df(t2)
bind_rows_polars(t1 = t1, tib2 = t2, .id = "id")
#> shape: (4, 3)
#> ┌──────┬─────┬─────┐
#> │ id   ┆ x   ┆ y   │
#> │ ---  ┆ --- ┆ --- │
#> │ str  ┆ str ┆ i32 │
#> ╞══════╪═════╪═════╡
#> │ t1   ┆ a   ┆ 1   │
#> │ t1   ┆ b   ┆ 2   │
#> │ tib2 ┆ c   ┆ 3   │
#> │ tib2 ┆ d   ┆ 4   │
#> └──────┴─────┴─────┘

@ginolhac
Author

Fast and sharp, wonderful!
I caught another little trick you don't support; let me know if this is asking too much or if I should open a new issue.
In dplyr::select(), one can select and rename at the same time, for example:

select(t1, x, why = y)
# A tibble: 2 × 2
  x       why
  <chr> <int>
1 a         1
2 b         2

With tidypolars, we currently need to write select(t1, x, y) |> rename(why = y) instead.
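
For reference, the two spellings are equivalent on a regular tibble. The sketch below uses plain dplyr only and does not exercise tidypolars; `one_step` and `two_steps` are illustrative names:

```r
library(dplyr, warn.conflicts = FALSE)

t1 <- tibble(x = c("a", "b"), y = 1:2)

# One step: select and rename in the same call.
one_step <- select(t1, x, why = y)

# Two steps: select first, then rename.
two_steps <- select(t1, x, y) |> rename(why = y)

# Compare the two results.
identical(one_step, two_steps)
```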
