pivot_longer changes unpivoted date-integer to date-double #1356

Closed
jonspring opened this issue Apr 23, 2022 · 1 comment

jonspring commented Apr 23, 2022

I learned today that dates in R can sometimes be stored as integers, not just as doubles. This happens in the example below, where seq returns a date-integer rather than a date-double. It seems like an unorthodox data type, and it could be considered a bug in seq that it arises at all. Unexpectedly, pivot_longer converts these date-integers into the typical date-double, and I couldn't figure out a way to undo that.

I'm not sure where this would cause problems for someone, but I was surprised to see one of my answers on SO did not exactly match the original data because of this subtle change. https://stackoverflow.com/questions/71972396/how-to-reduce-processing-time-of-a-code-in-r

set.seed(42)
df <- data.frame(date_dbl = as.Date("2022-04-23"),
                 date_int = seq(as.Date("2022-04-23"), length.out=2, by=1),
                 a = 1:2,
                 b = 3:4)
df
typeof(df$date_dbl)
# [1] "double"
typeof(df$date_int)
# [1] "integer"

df_piv <- tidyr::pivot_longer(df, -(1:2))
typeof(df_piv$date_dbl)
# [1] "double"
typeof(df_piv$date_int)
# [1] "double"
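On undoing the conversion: a minimal base R sketch, assuming the goal is only to restore integer storage on a Date vector after the fact. It builds the integer-backed Date by hand (so it does not depend on which R version's seq is in use), simulates the integer-to-double change with `storage.mode<-`, and then reverses it. The variable names are illustrative, and this says nothing about how pivot_longer performs the conversion internally.

```r
# Build an integer-backed Date by hand; as.integer() drops the class,
# structure() puts it back on top of integer storage.
date_int <- structure(
  as.integer(as.Date(c("2022-04-23", "2022-04-24"))),
  class = "Date"
)
typeof(date_int)

# Simulate the change described above: same class, double storage.
date_dbl <- date_int
storage.mode(date_dbl) <- "double"
typeof(date_dbl)

# The two vectors hold the same dates but are not identical objects,
# because identical() also compares the storage type.
same_dates <- all(date_int == date_dbl)
same_object <- identical(date_int, date_dbl)

# Restore integer storage; storage.mode<- keeps the Date class attribute.
restored <- date_dbl
storage.mode(restored) <- "integer"
round_trip <- identical(restored, date_int)
```

The same assignment could presumably be applied to a pivoted column, for example `storage.mode(df_piv$date_int) <- "integer"`, if an exact match with the original data is needed.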
jonspring changed the title from "pivot_longer changes type of unpivoted date" to "pivot_longer changes unpivoted date-integer to date-double" on Apr 25, 2022
DavisVaughan (Member) commented

This is actually intentional. Date vectors are really meant to be double (for better or worse), and it has been a longstanding bug that they are sometimes returned as integer by base R functions. seq.Date() was actually fixed in R 4.2, and now returns a double vector:

x <- seq(as.Date("2022-04-23"), length.out=2, by=1)

typeof(x)
#> [1] "double"

Somewhere under the hood, vctrs is "normalizing" or "fixing" the integer date vector and turning it into a double vector. We generally want this behavior, so I would call this expected.

Unfortunately, there isn't a great place to document this, but I expect it to come up less often now that R 4.2+ generates integer Date vectors less frequently.
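For anyone who wants to check whether their own data still carries integer-backed dates, here is a small sketch in base R. `has_integer_date` is a hypothetical helper written for this illustration, not a function from tidyr or vctrs, and the example data frame is built by hand.

```r
# Hypothetical helper: flag columns that have class Date but integer storage.
has_integer_date <- function(df) {
  vapply(
    df,
    function(col) inherits(col, "Date") && typeof(col) == "integer",
    logical(1)
  )
}

df <- data.frame(
  date_dbl = as.Date("2022-04-23") + 0:1,  # Date arithmetic yields double storage
  a = 1:2
)
# Add an integer-backed Date column explicitly.
df$date_int <- structure(as.integer(df$date_dbl), class = "Date")

flags <- has_integer_date(df)
flags
```

Only `date_int` should be flagged; `a` is an integer column but not a Date, and `date_dbl` is a Date with double storage.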
