Documentation for pivot_longer() - ptypes vs transform #1364

Closed
rjake opened this issue Jun 2, 2022 · 2 comments · Fixed by #1366

@rjake

rjake commented Jun 2, 2022

I love the new(ish) pivot_*() functions. Although I use them a lot, I don't understand when to use ptypes and when to use transform. Could the documentation include examples for each? Data class mismatches are the most common error I face, and I found suggestions to use each of these when searching Stack Overflow.

library(tidyverse)

# no
mpg |> pivot_longer(cols = -year, values_ptypes = as.character())
mpg |> pivot_longer(cols = -year, values_ptypes = character())
mpg |> pivot_longer(cols = -year, values_ptypes = list(value = "character"))
# Error: Can't convert <double> to <character>.

# yes
mpg |> pivot_longer(cols = -year, values_transform = as.character)
# year  name          value   
# <int> <chr>         <chr>   
# 1999  manufacturer  audi    
# 1999  model         a4      
# 1999  displ         1.8     
# 1999  cyl           4   

@DavisVaughan
Member

I think we do a pretty good job of documenting them already.

The names_ptypes/values_ptypes docs say:

Use these arguments if you want to confirm that the created columns are the types that you expect. Note that if you want to change (instead of confirm) the types of specific columns, you should use names_transform or values_transform instead.

The names_transform/values_transform docs say:

Use these arguments if you need to change the types of specific columns. For example, names_transform = list(week = as.integer) would convert a character variable called week to an integer.

And then in the examples section we point you to the pivot vignette, which has a section with an example that uses names_transform: https://tidyr.tidyverse.org/articles/pivot.html#billboard
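
For readers who don't want to open the vignette, here is a minimal sketch of that kind of names_transform usage. The data frame and its column names (song, wk1, wk2) are made up for illustration and are not the billboard data itself; only the pivot_longer() arguments mirror the docs quoted above.

```r
library(tidyr)

# Hypothetical toy data: one row per song, one column per week
df <- tibble(
  song = c("x", "y"),
  wk1  = c(10, 20),
  wk2  = c(8, 15)
)

long <- pivot_longer(
  df,
  cols = starts_with("wk"),
  names_to = "week",
  names_prefix = "wk",                        # strip "wk" so "1" and "2" remain
  names_transform = list(week = as.integer),  # convert the character names to integer
  values_to = "rank"
)

long$week
```

Without names_transform, the week column would stay character, because column names are always strings before any conversion.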


So the _ptypes arguments are mainly for checking that the type really is the one you think it is. In your case, the values were double vectors when you "expected" them to be character based on your use of values_ptypes, so you got an error.

The _transform arguments are for explicitly converting to a type that you want, even if the data didn't start out that way. as.character is the function you use to perform the conversion to character.
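
The distinction can be sketched side by side. This is an illustrative example on a made-up tibble (the columns id, a, and b are assumptions, not from the issue), showing values_ptypes confirming a type, values_transform converting one, and values_ptypes failing when the expected type doesn't match.

```r
library(tidyr)

# Hypothetical toy data with two double columns to pivot
df <- tibble(id = 1:2, a = c(1.5, 2.5), b = c(3.5, 4.5))

# Confirm: the values are already double, so the check passes
checked <- pivot_longer(
  df,
  cols = c(a, b),
  values_ptypes = list(value = double())
)

# Convert: as.character() is applied to the value column
converted <- pivot_longer(
  df,
  cols = c(a, b),
  values_transform = list(value = as.character)
)

# Mismatch: "expecting" character when the values are double is an error
failed <- tryCatch(
  pivot_longer(df, cols = c(a, b), values_ptypes = list(value = character())),
  error = function(e) "type mismatch"
)

class(checked$value)
class(converted$value)
failed
```

In short, reach for values_transform when the goal is a different type, and keep values_ptypes for asserting the type you already expect.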

@DavisVaughan
Member

That error message could be improved though. It is supposed to tell you the column it can't convert to character.

library(tidyr)

df <- tibble(x = 1)

# Should say:
# Can't convert `x` <double> to <character>.
pivot_longer(
  df, 
  cols = x, 
  names_to = "name",
  values_to = "value",
  values_ptypes = list(value = character())
)
#> Error:
#> ! Can't convert <double> to <character>.

Created on 2022-06-02 by the reprex package (v2.0.1)
