Add pivot_longer() #48

nathaneastwood · 2020-08-15T13:44:09Z

No description provided.

grantmcdermott · 2021-11-14T23:34:05Z

I'm sure you've already considered this, but here is a simple proof of concept using stack():

pivot_longer = 
  function(
    data,
    cols,
    names_to = "name",
    names_prefix = NULL,
    # names_sep = NULL,
    # names_pattern = NULL,
    # names_ptypes = list(),
    # names_transform = list(),
    # names_repair = "check_unique",
    values_to = "value",
    # values_drop_na = FALSE,
    # values_ptypes = list(),
    # values_transform = list(),
    ...
  ){
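    # Capture the unquoted selection as text; only a bare column name or a
    # single negated column (e.g. !religion or -religion) is handled here
    cols = deparse(substitute(cols))
    if (grepl("^!|^-", cols)) {
      cols = setdiff(names(data), gsub("^!|^-", "", cols))
    }
    ocols = setdiff(names(data), cols)
    # drop = FALSE keeps single-column selections as data frames
    stacked_data = stack(data[, cols, drop = FALSE])[, 2:1]
    ret = cbind(data[, ocols, drop = FALSE], setNames(stacked_data, c(names_to, values_to)))
    if (!is.null(names_prefix)) {
      ret[[names_to]] = gsub(paste0("^", names_prefix), "", ret[[names_to]])
    }
    # Order by the first remaining column so rows sharing its value sit together
    ret = ret[order(ret[, ocols[1]]), ]
    rownames(ret) = seq_len(nrow(ret))
    ret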
  }

#
## First example from ?tidyr::pivot_longer
#

data('relig_income', package = 'tidyr')

## This version
relig_income |>
  pivot_longer(!religion, names_to = "income", values_to = "count") |>
  head(10)
#>    religion             income count
#> 1  Agnostic              <$10k    27
#> 2  Agnostic            $10-20k    34
#> 3  Agnostic            $20-30k    60
#> 4  Agnostic            $30-40k    81
#> 5  Agnostic            $40-50k    76
#> 6  Agnostic            $50-75k   137
#> 7  Agnostic           $75-100k   122
#> 8  Agnostic          $100-150k   109
#> 9  Agnostic              >150k    84
#> 10 Agnostic Don't know/refused    96

## tidyr version
relig_income |>
  tidyr::pivot_longer(!religion, names_to = "income", values_to = "count") |>
  data.frame() |> head(10)
#>    religion             income count
#> 1  Agnostic              <$10k    27
#> 2  Agnostic            $10-20k    34
#> 3  Agnostic            $20-30k    60
#> 4  Agnostic            $30-40k    81
#> 5  Agnostic            $40-50k    76
#> 6  Agnostic            $50-75k   137
#> 7  Agnostic           $75-100k   122
#> 8  Agnostic          $100-150k   109
#> 9  Agnostic              >150k    84
#> 10 Agnostic Don't know/refused    96

#
## Contrived names_prefix usage
#

## This version
relig_income[, c(1, 3:9)] |>
  pivot_longer(!religion, names_to = "income", values_to = "count", names_prefix = "\\$") |>
  head(10)
#>    religion   income count
#> 1  Agnostic   10-20k    34
#> 2  Agnostic   20-30k    60
#> 3  Agnostic   30-40k    81
#> 4  Agnostic   40-50k    76
#> 5  Agnostic   50-75k   137
#> 6  Agnostic  75-100k   122
#> 7  Agnostic 100-150k   109
#> 8   Atheist   10-20k    27
#> 9   Atheist   20-30k    37
#> 10  Atheist   30-40k    52

## tidyr version
relig_income[, c(1, 3:9)] |>
  tidyr::pivot_longer(!religion, names_to = "income", values_to = "count", names_prefix = "\\$") |>
  data.frame() |> head(10)
#>    religion   income count
#> 1  Agnostic   10-20k    34
#> 2  Agnostic   20-30k    60
#> 3  Agnostic   30-40k    81
#> 4  Agnostic   40-50k    76
#> 5  Agnostic   50-75k   137
#> 6  Agnostic  75-100k   122
#> 7  Agnostic 100-150k   109
#> 8   Atheist   10-20k    27
#> 9   Atheist   20-30k    37
#> 10  Atheist   30-40k    52

Created on 2021-11-14 by the reprex package (v2.0.1)
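For anyone unfamiliar with stack(), here is a minimal sketch of the mechanics the proof of concept leans on. The toy data frame `wide` and its column names are made up for illustration; nothing here comes from the package itself.

```r
# Toy wide data: one id column plus two value columns (illustrative only)
wide <- data.frame(id = c("a", "b"), x = 1:2, y = 3:4)

# stack() returns `values` first and `ind` (a factor of source column names) second
stacked <- stack(wide[, c("x", "y")])
names(stacked)
#> [1] "values" "ind"

# Flip to name/value order and rename, as the proof of concept does with [, 2:1]
stacked <- setNames(stacked[, 2:1], c("name", "value"))

# cbind() recycles the 2-row id column to match the 4 stacked rows.
# Rows come out column by column: all x values first, then all y values.
long <- cbind(wide[, "id", drop = FALSE], stacked)
long
```

This is why the proof of concept reorders by the first remaining column afterwards: without that step the output is grouped by source column rather than by original row.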

Two asides:

I have a strong suspicion that going with stack/unstack is ultimately going to be easier than reshape. The latter (a) is very particular about its input form and arguments, and (b) will probably still require some binds and internal manipulation once you start supporting the additional pivot_* arguments.
The main complication in extending this proof of concept is handling the NSE. But I believe(?) you've already developed an internal system for deparsing NSE vectors etc., so that could make things a lot easier.
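To make the NSE point concrete, here is one possible sketch of resolving an unquoted selection to column names in base R. It borrows the trick of evaluating the expression against a list of column positions. `resolve_cols()` and `pivot_cols()` are hypothetical names invented for this example, not the package's internal system, and treating a leading `!` as "everything except" is an assumption.

```r
# Hypothetical helper: turn a quoted selection into column names
resolve_cols <- function(data, expr, env = parent.frame()) {
  # Map each column name to its position so names evaluate to integers
  pos <- as.list(seq_along(data))
  names(pos) <- names(data)
  # Assumption: a leading `!` means "everything except", same as `-`
  if (is.call(expr) && identical(expr[[1]], as.name("!"))) {
    expr[[1]] <- as.name("-")
  }
  idx <- eval(expr, pos, env)
  # Character input (e.g. "x") is passed through unchanged
  if (is.character(idx)) idx else names(data)[idx]
}

# Hypothetical wrapper showing how a pivot function could capture `cols`
pivot_cols <- function(data, cols) {
  resolve_cols(data, substitute(cols), parent.frame())
}

df <- data.frame(id = 1:2, x = 3:4, y = 5:6, z = 7:8)
pivot_cols(df, !id)         # all columns except id
pivot_cols(df, c(x, z))     # an explicit set of columns
pivot_cols(df, x:y)         # a range of adjacent columns
pivot_cols(df, -c(id, x))   # drop several columns
```

Unlike the deparse() approach, this handles multi-column selections, though it makes no attempt to support tidyselect helpers.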

EDIT: Added names_prefix arg and example.
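Going back to the first aside, a rough side-by-side of the two base R routes on a toy data frame may help show what "particular about its arguments" means in practice. The data and column names are invented for illustration, and this is only a sketch of the simplest case.

```r
# Toy wide data (illustrative only)
wide <- data.frame(id = c("a", "b"), x = 1:2, y = 3:4)

# reshape(): the varying columns, their labels, and the id column
# all have to be spelled out
long_reshape <- reshape(
  wide,
  direction = "long",
  varying   = c("x", "y"),  # columns to gather
  v.names   = "value",      # name of the new value column
  timevar   = "name",       # name of the new key column
  times     = c("x", "y"),  # labels written into the key column
  idvar     = "id"
)

# stack(): gather first, then bind the id column back on
stacked <- setNames(stack(wide[, c("x", "y")])[, 2:1], c("name", "value"))
long_stack <- cbind(wide[, "id", drop = FALSE], stacked)

# Both give the same columns and values; reshape() additionally sets
# composite row names and attaches a "reshapeLong" attribute
names(long_reshape)
names(long_stack)
```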

nathaneastwood added the feature request New feature or request label Aug 15, 2020

nathaneastwood changed the title ~~[FEAT] Add pivot_longer()~~ Add pivot_longer() Nov 24, 2020

nathaneastwood changed the title ~~Add pivot_longer()~~ Add pivot_longer() Jan 19, 2021

grantmcdermott mentioned this issue Jul 28, 2022

Add pivot_wider() #47

Closed

etiennebacher mentioned this issue Jul 28, 2022

Implement pivot_wider and pivot_longer #98

Closed

nathaneastwood closed this as completed Jul 31, 2022

This was referenced Oct 7, 2022

Rewrite reshaping functions to improve performance easystats/datawizard#285

Merged

Update pivot_ functions for performance #117

Open
