hoist doesn't handle duplicate names well #834

Closed
mgirlich opened this issue Dec 13, 2019 · 3 comments · Fixed by #924
Labels
documentation rectangling 🗄️

Comments

@mgirlich (Contributor) commented Dec 13, 2019

hoist() simply overwrites existing columns and doesn't check that the names of the ... argument are unique:

library(tidyr)
df <- tibble(
  character = c("Toothless", "Dory"),
  metadata = list(
    list(
      species = "dragon",
      color = "black",
      films = c(
        "How to Train Your Dragon",
        "How to Train Your Dragon 2",
        "How to Train Your Dragon: The Hidden World"
      )
    ),
    list(
      species = "blue tang",
      color = "blue",
      films = c("Finding Nemo", "Finding Dory")
    )
  )
)

# doesn't care about duplicate name
hoist(.data = df,
  .col = metadata,
  film = list("films", 1L),
  film = list("films", 3L)
)
#> # A tibble: 2 x 4
#>   character film                 film                             metadata      
#>   <chr>     <chr>                <chr>                            <list>        
#> 1 Toothless How to Train Your D… How to Train Your Dragon: The H… <named list […
#> 2 Dory      Finding Nemo         <NA>                             <named list […

# overwrite existing column
hoist(.data = df,
  .col = metadata,
  character = list("films", 2L)
)
#> # A tibble: 2 x 2
#>   metadata         character                 
#>   <list>           <chr>                     
#> 1 <named list [3]> How to Train Your Dragon 2
#> 2 <named list [3]> Finding Dory

Created on 2019-12-13 by the reprex package (v0.3.0)

@hadley (Member) commented Apr 1, 2020

Yes, hoist() follows the same principle as mutate(): assigning to an existing column name overwrites that column. This could be documented better.
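
For context, the mutate() behaviour referred to here can be seen in a small sketch. It assumes dplyr is installed, and the `chars` data frame is made up for illustration (it is not part of the original reprex). Assigning to a name that already exists replaces that column rather than adding a second one:

```r
library(dplyr)

# Hypothetical toy data, not taken from the issue
chars <- tibble(
  character = c("Toothless", "Dory"),
  species = c("dragon", "blue tang")
)

# Assigning to an existing name replaces that column;
# no second `character` column is created
out <- mutate(chars, character = toupper(character))

names(out)
out$character
```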

@hadley added the documentation and rectangling 🗄️ labels Apr 1, 2020
@mgirlich (Contributor, Author) commented Apr 2, 2020

What about the first case with duplicate names in the dots?

# doesn't care about duplicate name
hoist(.data = df,
  .col = metadata,
  film = list("films", 1L),
  film = list("films", 3L)
)
#> # A tibble: 2 x 4
#>   character film                 film                             metadata      
#>   <chr>     <chr>                <chr>                            <list>        
#> 1 Toothless How to Train Your D… How to Train Your Dragon: The H… <named list […
#> 2 Dory      Finding Nemo         <NA>                             <named list […

This doesn't follow the logic of mutate() and simply produces a duplicate column name, which isn't very nice. So I think the names in the dots should be unique. Alternatively, hoist() could get a .names_repair argument, but this would probably be confusing alongside the mutate()-like behaviour for existing columns.
I am happy to create a PR once this is decided.

@hadley (Member) commented Apr 2, 2020

Ah yeah, this should definitely be an error.
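
As a rough illustration of what such an error could look like, here is a minimal base R sketch. The helper name `check_unique_dots()` and the error wording are hypothetical, and this is not necessarily how tidyr implements the check. It assumes every element of the dots is named, as hoist() expects:

```r
# Hypothetical helper, NOT tidyr's actual implementation:
# signal an error when the names supplied through `...` are not unique.
check_unique_dots <- function(...) {
  pluckers <- list(...)
  nms <- names(pluckers)

  # Names that occur more than once, each reported once
  dups <- unique(nms[duplicated(nms)])

  if (length(dups) > 0) {
    stop(
      "Column names must be unique. Duplicated: ",
      paste(dups, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(pluckers)
}

# The duplicated `film` from the reprex is now rejected;
# the error message is captured so the script keeps running
res <- tryCatch(
  check_unique_dots(film = list("films", 1L), film = list("films", 3L)),
  error = function(e) conditionMessage(e)
)
res

# Unique names pass through unchanged
ok <- check_unique_dots(first_film = list("films", 1L), third_film = list("films", 3L))
names(ok)
```

Failing early on the dots keeps the mutate()-style overwrite of existing columns untouched, so the two cases raised in this issue stay separate.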
