Spreading + removing irrelevant columns #572

Closed
jackdolgin opened this issue Mar 6, 2019 · 2 comments

@jackdolgin
commented Mar 6, 2019

Such a big fan of tidyr! (I hope I'm doing this right; I've only posted on Stack Overflow before.)

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

CO2low <- CO2 %>% filter(conc < 250)

spread(CO2low, conc, uptake) #attempt1

#    Plant        Type  Treatment   95  175
# 1    Qn1      Quebec nonchilled 16.0 30.4
# 2    Qn2      Quebec nonchilled 13.6 27.3
# 3    Qn3      Quebec nonchilled 16.2 32.4

CO2low$randoms <- runif(nrow(CO2low), min=0, max=1)
spread(CO2low, conc, uptake) #attempt2
#   Plant        Type  Treatment      randoms   95  175
# 1    Qn1      Quebec nonchilled 1.539457e-01 16.0   NA
# 2    Qn1      Quebec nonchilled 5.429544e-01   NA 30.4
# 3    Qn2      Quebec nonchilled 1.415618e-01 13.6   NA
# 4    Qn2      Quebec nonchilled 8.917103e-01   NA 27.3

I'd like my second spread attempt to produce the same output as the first. I imagine I could use 'select(-randoms)', but is there an argument to spread that would say something along the lines of, 'Drop any other columns whose values aren't identical across the rows being spread'?
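
For reference, the 'select(-randoms)' workaround mentioned in the question can be sketched as follows. This is only a minimal illustration, assuming dplyr and tidyr are installed; the as.data.frame() call and the name `wide` are additions for this example and are not part of the original post.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# as.data.frame() drops the extra groupedData classes that the built-in CO2
# dataset carries; this step is an assumption added for the sketch.
CO2low <- as.data.frame(CO2) %>% filter(conc < 250)
CO2low$randoms <- runif(nrow(CO2low), min = 0, max = 1)

# Drop the column that varies within each plant before spreading, so the
# remaining columns (Plant, Type, Treatment) identify one row per plant.
wide <- CO2low %>%
  select(-randoms) %>%
  spread(conc, uptake)

dim(wide)
```

With `randoms` removed, each plant collapses to a single row again, as in the first attempt.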

@hadley hadley added this to the v1.0.0 milestone Mar 6, 2019
@hadley

Member

commented Mar 6, 2019

Thanks for filing this issue because I was thinking about it and you've saved me some typing 😄

@hadley

Member

commented Mar 6, 2019

library(tidyr)
library(dplyr, warn.conflicts = FALSE)

CO2low <- CO2 %>% filter(conc < 250)
CO2low$randoms <- runif(nrow(CO2low), min=0, max=1)

CO2low %>% 
  pivot_wide(keys = Plant:Treatment, names_from = conc, values_from = uptake)
#> # A tibble: 12 x 5
#>    Plant Type        Treatment   `95` `175`
#>    <ord> <fct>       <fct>      <dbl> <dbl>
#>  1 Qn1   Quebec      nonchilled  16    30.4
#>  2 Qn2   Quebec      nonchilled  13.6  27.3
#>  3 Qn3   Quebec      nonchilled  16.2  32.4
#>  4 Qc1   Quebec      chilled     14.2  24.1
#>  5 Qc2   Quebec      chilled      9.3  27.3
#>  6 Qc3   Quebec      chilled     15.1  21  
#>  7 Mn1   Mississippi nonchilled  10.6  19.2
#>  8 Mn2   Mississippi nonchilled  12    22  
#>  9 Mn3   Mississippi nonchilled  11.3  19.4
#> 10 Mc1   Mississippi chilled     10.5  14.9
#> 11 Mc2   Mississippi chilled      7.7  11.4
#> 12 Mc3   Mississippi chilled     10.6  18

Created on 2019-03-06 by the reprex package (v0.2.1.9000)
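
The reprex above was run against a development version of tidyr, where the function was called pivot_wide() and took a keys argument. In the released tidyr (1.0.0 and later) the function is pivot_wider() and the corresponding argument is id_cols. A rough equivalent under that assumption is sketched below; the as.data.frame() call and the name `wide` are additions for this example.

```r
library(tidyr)
library(dplyr, warn.conflicts = FALSE)

# as.data.frame() drops the extra groupedData classes of the built-in CO2
# dataset; this step is an assumption added for the sketch.
CO2low <- as.data.frame(CO2) %>% filter(conc < 250)
CO2low$randoms <- runif(nrow(CO2low), min = 0, max = 1)

# id_cols names the columns that identify each output row. Columns that are
# neither id_cols, names_from, nor values_from (here: randoms) are dropped.
wide <- CO2low %>%
  pivot_wider(id_cols = Plant:Treatment, names_from = conc, values_from = uptake)

dim(wide)
```

Listing the identifying columns explicitly is what lets `randoms` be ignored without a separate select() step.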

@hadley hadley closed this in d7ff996 Mar 6, 2019