Add pivot_longer() #48

Closed
nathaneastwood opened this issue Aug 15, 2020 · 1 comment
Labels
feature request New feature or request

Comments

@nathaneastwood
Owner

No description provided.

@nathaneastwood nathaneastwood added the feature request New feature or request label Aug 15, 2020
@nathaneastwood nathaneastwood changed the title [FEAT] Add pivot_longer() Add pivot_longer() Nov 24, 2020
@nathaneastwood nathaneastwood changed the title Add pivot_longer() Add pivot_longer() Jan 19, 2021
@grantmcdermott

grantmcdermott commented Nov 14, 2021

I'm sure you've already considered this, but here's a simple proof of concept using stack():

pivot_longer = 
  function(
    data,
    cols,
    names_to = "name",
    names_prefix = NULL,
    # names_sep = NULL,
    # names_pattern = NULL,
    # names_ptypes = list(),
    # names_transform = list(),
    # names_repair = "check_unique",
    values_to = "value",
    # values_drop_na = FALSE,
    # values_ptypes = list(),
    # values_transform = list(),
    ...
  ){
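    # Capture the unevaluated `cols` expression as a string, e.g. "!religion"
    cols = deparse(substitute(cols))
    # A leading `!` or `-` means "every column except this one"
    if (grepl("^!|^-", cols)) {
      cols = setdiff(names(data), gsub("^!|^-", "", cols))
    }
    # Everything else is an identifier column that is carried along
    ocols = setdiff(names(data), cols)
    # stack() returns (values, ind); swap to (name, value) order.
    # drop = FALSE keeps single-column selections as data frames.
    stacked_data = stack(data[, cols, drop = FALSE])[, 2:1]
    ret = cbind(data[, ocols, drop = FALSE], setNames(stacked_data, c(names_to, values_to)))
    if (!is.null(names_prefix)) {
      ret[[names_to]] = gsub(paste0("^", names_prefix), "", ret[[names_to]])
    }
    # Group rows by the first identifier column, then reset the row names
    ret = ret[order(ret[, ocols[1]]), ]
    rownames(ret) = seq_len(nrow(ret))
    ret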
  }

#
## First example from ?tidyr::pivot_longer
#

data('relig_income', package = 'tidyr')

## This version
relig_income |>
  pivot_longer(!religion, names_to = "income", values_to = "count") |>
  head(10)
#>    religion             income count
#> 1  Agnostic              <$10k    27
#> 2  Agnostic            $10-20k    34
#> 3  Agnostic            $20-30k    60
#> 4  Agnostic            $30-40k    81
#> 5  Agnostic            $40-50k    76
#> 6  Agnostic            $50-75k   137
#> 7  Agnostic           $75-100k   122
#> 8  Agnostic          $100-150k   109
#> 9  Agnostic              >150k    84
#> 10 Agnostic Don't know/refused    96

## tidyr version
relig_income |>
  tidyr::pivot_longer(!religion, names_to = "income", values_to = "count") |>
  data.frame() |> head(10)
#>    religion             income count
#> 1  Agnostic              <$10k    27
#> 2  Agnostic            $10-20k    34
#> 3  Agnostic            $20-30k    60
#> 4  Agnostic            $30-40k    81
#> 5  Agnostic            $40-50k    76
#> 6  Agnostic            $50-75k   137
#> 7  Agnostic           $75-100k   122
#> 8  Agnostic          $100-150k   109
#> 9  Agnostic              >150k    84
#> 10 Agnostic Don't know/refused    96

#
## Contrived names_prefix usage
#

## This version
relig_income[, c(1, 3:9)] |>
  pivot_longer(!religion, names_to = "income", values_to = "count", names_prefix = "\\$") |>
  head(10)
#>    religion   income count
#> 1  Agnostic   10-20k    34
#> 2  Agnostic   20-30k    60
#> 3  Agnostic   30-40k    81
#> 4  Agnostic   40-50k    76
#> 5  Agnostic   50-75k   137
#> 6  Agnostic  75-100k   122
#> 7  Agnostic 100-150k   109
#> 8   Atheist   10-20k    27
#> 9   Atheist   20-30k    37
#> 10  Atheist   30-40k    52

## tidyr version
relig_income[, c(1, 3:9)] |>
  tidyr::pivot_longer(!religion, names_to = "income", values_to = "count", names_prefix = "\\$") |>
  data.frame() |> head(10)
#>    religion   income count
#> 1  Agnostic   10-20k    34
#> 2  Agnostic   20-30k    60
#> 3  Agnostic   30-40k    81
#> 4  Agnostic   40-50k    76
#> 5  Agnostic   50-75k   137
#> 6  Agnostic  75-100k   122
#> 7  Agnostic 100-150k   109
#> 8   Atheist   10-20k    27
#> 9   Atheist   20-30k    37
#> 10  Atheist   30-40k    52

Created on 2021-11-14 by the reprex package (v2.0.1)

Two asides:

  1. I strongly suspect that going with stack/unstack will ultimately be easier than reshape. The latter (a) is very particular about its input form and arguments, and (b) will probably require some binds and internal manipulation anyway once you start invoking the additional pivot_* arguments.
  2. The main complication in extending this proof of concept is handling the NSE. But I believe(?) you've already developed an internal system for deparsing NSE vectors etc., so that could make things a lot easier.
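
To make the first aside concrete, here is a small side-by-side sketch of the two base R routes on toy data. The data frame `wide` and its column names are made up for illustration; only `stack()`, `reshape()`, `cbind()` and `setNames()` are used, all from base R.

```r
# Toy wide data; the column names are invented for this illustration
wide = data.frame(id = c("a", "b"), x = 1:2, y = 3:4)

# Route 1: stack() returns two columns, `values` and `ind` (source column)
long_stack = cbind(
  wide["id"],  # recycled by cbind() to match the stacked rows
  setNames(stack(wide[, c("x", "y")])[, 2:1], c("name", "value"))
)

# Route 2: reshape() needs the whole layout spelled out up front
long_reshape = reshape(
  wide,
  direction = "long",
  varying   = c("x", "y"),   # columns to pivot
  v.names   = "value",       # name of the values column
  timevar   = "name",        # name of the names column
  times     = c("x", "y"),   # labels written into the names column
  idvar     = "id"
)
```

Both routes produce the same `id`, `name` and `value` content here, but `reshape()` also attaches composite row names and a `reshapeLong` attribute that would likely need cleaning up afterwards.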

EDIT: Added names_prefix arg and example.
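
On the NSE point in the second aside: the proof of concept only understands a single bare column name, optionally negated. One possible base R approach, sketched below, is the trick `subset()` uses for its `select` argument: evaluate the expression in a list that maps column names to their positions. The helper name `resolve_cols` and the wrapper `demo` are hypothetical and are not the package's internal system.

```r
# Hypothetical helper: turn a captured `cols` expression into column names
resolve_cols = function(data, expr) {
  # Map each column name to its integer position, so that `a:c`,
  # `c(a, b)` and `-a` all evaluate to integer indices
  positions = as.list(seq_along(data))
  names(positions) = names(data)
  # Treat a top-level `!x` as `-x`
  if (is.call(expr) && identical(expr[[1]], as.name("!"))) {
    expr[[1]] = as.name("-")
  }
  idx = eval(expr, positions, parent.frame())
  names(data)[idx]
}

# Hypothetical wrapper showing how a pivot function would capture `cols`
demo = function(data, cols) resolve_cols(data, substitute(cols))

df = data.frame(id = 1, a = 2, b = 3, c = 4)
demo(df, !id)
demo(df, c(a, b))
demo(df, a:c)
```

This sketch does not cover selection helpers such as `starts_with()`, which would need separate handling.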
