A dedicated verb to move columns within a table #4598

Closed
mikmart opened this issue Oct 19, 2019 · 8 comments
Labels
feature verbs 🏃‍♀️

Comments

@mikmart
Contributor

mikmart commented Oct 19, 2019

This topic has been brought up before by @grayskripko in #2047 (comment). @moodymudskipper has also shared an implementation in #2047 (comment), born from a Stack Overflow question. It's my hope that separating this from #2047 into an issue of its own could reinvigorate a more focused discussion about a specialized verb for moving columns, outside the specific context of mutate().

It's often useful to reorder columns within a table: for example to group related columns together, or to bring identifier/grouping variables to the front to emphasize the structure of the data.

Crafting a call to select() to accomplish the job often pulls your focus away from the analysis and toward the mechanics of moving columns to the right place without accidentally dropping any of them. The resulting calls are often long and unintuitive, and can contain duplicated column names or references to columns that aren't directly related to the operation. Later, when reading the code, it can take a while to decipher what's happening. Consider, for example:

select(iris, 1:(Petal.Width - 1), -Sepal.Width, Sepal.Width, everything())

It does make sense after you think about it, given that you are familiar with how tidyselect works; it describes the mechanics, but it doesn't do a very good job of communicating intent. Contrast with:

move(iris, Sepal.Width, .before = Petal.Width)

It's concise, makes the intent instantly clear, matches how you likely thought about the operation in the first place, and leaves the mechanics for the function implementation to handle.

Having a dedicated verb to move columns would be a big quality of life improvement that would make analysis and reporting flow more naturally.
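
To make the mechanics that such a verb would hide concrete, here is a minimal base R sketch. The helper name move_cols_base() and its string-based `before`/`after` arguments are assumptions for illustration only; this is not the implementation from the branch mentioned below, and it does not support tidyselect helpers.

```r
# Hypothetical helper (not part of dplyr): reorder columns by name in base R.
move_cols_base <- function(df, cols, before = NULL, after = NULL) {
  stopifnot(is.data.frame(df), all(cols %in% names(df)))
  if (!is.null(before) && !is.null(after)) {
    stop("Supply only one of `before` or `after`.")
  }

  # Columns that are not being moved, in their original order.
  rest <- setdiff(names(df), cols)

  if (!is.null(before)) {
    stopifnot(length(before) == 1, before %in% rest)
    pos <- match(before, rest) - 1  # insert just ahead of the reference
  } else if (!is.null(after)) {
    stopifnot(length(after) == 1, after %in% rest)
    pos <- match(after, rest)       # insert just behind the reference
  } else {
    pos <- 0                        # no reference given: move to the front
  }

  # append() with after = 0 prepends, so the same call covers every case.
  df[append(rest, cols, after = pos)]
}

names(move_cols_base(iris, "Sepal.Width", before = "Petal.Width"))
names(move_cols_base(iris, "Species"))
```

The sketch never drops a column, because the new order is always the untouched columns plus the moved ones. That is the guarantee that is easy to get wrong in a hand-written select() call.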


PS. If this is deemed a good fit for dplyr, I also have a branch from a little while back with an initial implementation that I could turn into a PR. Here are some examples:

# remotes::install_github("mikmart/dplyr@move")
library(dplyr, warn.conflicts = FALSE, lib.loc = "~/tmp")

(iris <- as_tibble(head(iris))) # so it prints a little nicer
#> # A tibble: 6 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa 
#> 4          4.6         3.1          1.5         0.2 setosa 
#> 5          5           3.6          1.4         0.2 setosa 
#> 6          5.4         3.9          1.7         0.4 setosa

# Basic usage
move(iris, Sepal.Width, .before = Petal.Width)
#> # A tibble: 6 x 5
#>   Sepal.Length Petal.Length Sepal.Width Petal.Width Species
#>          <dbl>        <dbl>       <dbl>       <dbl> <fct>  
#> 1          5.1          1.4         3.5         0.2 setosa 
#> 2          4.9          1.4         3           0.2 setosa 
#> 3          4.7          1.3         3.2         0.2 setosa 
#> 4          4.6          1.5         3.1         0.2 setosa 
#> 5          5            1.4         3.6         0.2 setosa 
#> 6          5.4          1.7         3.9         0.4 setosa

# Several columns selected and as reference
move(iris, starts_with("Sepal"), .after = starts_with("Petal"))
#> # A tibble: 6 x 5
#>   Petal.Length Petal.Width Sepal.Length Sepal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#> 1          1.4         0.2          5.1         3.5 setosa 
#> 2          1.4         0.2          4.9         3   setosa 
#> 3          1.3         0.2          4.7         3.2 setosa 
#> 4          1.5         0.2          4.6         3.1 setosa 
#> 5          1.4         0.2          5           3.6 setosa 
#> 6          1.7         0.4          5.4         3.9 setosa

# Move a variable to first
move(iris, Species, .before = everything())
#> # A tibble: 6 x 5
#>   Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#>   <fct>          <dbl>       <dbl>        <dbl>       <dbl>
#> 1 setosa           5.1         3.5          1.4         0.2
#> 2 setosa           4.9         3            1.4         0.2
#> 3 setosa           4.7         3.2          1.3         0.2
#> 4 setosa           4.6         3.1          1.5         0.2
#> 5 setosa           5           3.6          1.4         0.2
#> 6 setosa           5.4         3.9          1.7         0.4

# Move a variable to last
move(iris, Sepal.Length, Sepal.Width, .after = last_col())
#> # A tibble: 6 x 5
#>   Petal.Length Petal.Width Species Sepal.Length Sepal.Width
#>          <dbl>       <dbl> <fct>          <dbl>       <dbl>
#> 1          1.4         0.2 setosa           5.1         3.5
#> 2          1.4         0.2 setosa           4.9         3  
#> 3          1.3         0.2 setosa           4.7         3.2
#> 4          1.5         0.2 setosa           4.6         3.1
#> 5          1.4         0.2 setosa           5           3.6
#> 6          1.7         0.4 setosa           5.4         3.9

Created on 2019-10-19 by the reprex package (v0.3.0)

@moodymudskipper

moodymudskipper commented Nov 7, 2019

I did this:

#' Move column or selection of columns
#'
#' Column(s) described by \code{cols} are moved before (default) or after the reference 
#'   column described by \code{ref}
#'
#' @param data A \code{data.frame}
#' @param cols unquoted column name or numeric or selection of columns using a select helper
#' @param ref unquoted column name
#' @param side \code{"before"} or \code{"after"}
#'
#' @return A data.frame with reordered columns
#' @export
#'
#' @examples
#' iris2 <- head(iris,2)
#' move(iris2, Species, Sepal.Width)
#' move(iris2, Species, Sepal.Width, "after")
#' move(iris2, 5, 2)
#' move(iris2, 4:5, 2)
#' move(iris2, one_of("Sepal.Width","Species"), Sepal.Width)
#' move(iris2, starts_with("Petal"), Sepal.Width)
move <- function(data, cols, ref, side = c("before","after")){
  if(! requireNamespace("dplyr")) 
    stop("Make sure package 'dplyr' is installed to use function 'move'")
  side <- match.arg(side)
  cols <- rlang::enquo(cols)
  ref  <- rlang::enquo(ref)
  if(side == "before") 
    dplyr::select(data,1:!!ref,-!!ref,-!!cols,!!cols,dplyr::everything()) 
  else
    dplyr::select(data,1:!!ref,-!!cols,!!cols,dplyr::everything())
}

https://stackoverflow.com/questions/52096919/move-a-column-conveniently

@mikmart
Contributor Author

mikmart commented Nov 7, 2019

Thanks for mentioning these! I saw them when going through #2047, and in retrospect, I should have done a better job of explaining the background when I opened this issue. I've updated the leading paragraph in an attempt to clarify the situation.

@hadley
Member

hadley commented Dec 10, 2019

This seems like a good idea, and from a quick skim, @mikmart's implementation seems solid. But I don't like the name: move() is a bit too short and a bit too generic — would you mind brainstorming a few other names?

@japhir

japhir commented Dec 31, 2019

> would you mind brainstorming a few other names?

What about reorder()? It would seem the most intuitive to me as a non-native speaker.

@moodymudskipper

moodymudskipper commented Jan 2, 2020

move_cols()? It looks like bind_cols() and is quite explicit.

Or graft(), transplant(), revamp(), reorganize(), reshuffle().

@hadley
Member

hadley commented Jan 6, 2020

This function would be similar to select() and rename(), so I think ideally it would be two syllables, 4-7 letters, and start with a letter in the latter part of the alphabet.

@hadley
Member

hadley commented Jan 14, 2020

Now that the dominant select() metaphor is logical indices, a dedicated move() verb makes even more sense.

@hadley
Member

hadley commented Jan 14, 2020

Hmmmm rotate() meets my criteria.

@hadley hadley closed this as completed in 33d9fb4 Jan 17, 2020
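
For readers on a current dplyr: releases from 1.0.0 onward ship a column-moving verb named relocate(), which takes .before and .after arguments like the ones proposed in this issue. The thread itself does not state the final name, so treat the following as a sketch that assumes dplyr >= 1.0.0 is installed, not as a record of what the closing commit contains.

```r
# Assumes dplyr >= 1.0.0; relocate() is not named anywhere in the thread above.
library(dplyr, warn.conflicts = FALSE)

df <- head(iris)

# Same shape as the move() calls from the opening comment.
names(relocate(df, Sepal.Width, .before = Petal.Width))
names(relocate(df, starts_with("Sepal"), .after = starts_with("Petal")))

# With neither .before nor .after, the selected columns go to the front.
names(relocate(df, Species))

# Move columns to the end.
names(relocate(df, Sepal.Length, Sepal.Width, .after = last_col()))
```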