Turn levels into numbers #18

Closed
hadley opened this issue Aug 16, 2016 · 6 comments
Labels
feature a feature request or enhancement

Comments

@hadley
Member

hadley commented Aug 16, 2016

levels(gss_cat$rincome)
#> [1] "No answer"      "Don't know"     "Refused"        "$25000 or more" "$20000 - 24999" "$15000 - 19999"
#> [7] "$10000 - 14999" "$8000 to 9999"  "$7000 to 7999"  "$6000 to 6999"  "$5000 to 5999"  "$4000 to 4999" 
#> [13] "$3000 to 3999"  "$1000 to 2999"  "Lt $1000"       "Not applicable"

It would be useful to have a moderately easy way to turn these levels into a numeric vector. It wouldn't be magic (i.e. it wouldn't compute the mean of the values); it would work like fct_revalue(), but create a numeric vector, and unmentioned levels would become NAs.
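A minimal base R sketch of the behaviour described above, assuming a named numeric vector is an acceptable way to spell the mapping. The helper name `levels_to_numeric()` and the dollar values are hypothetical, chosen only for illustration; this is not a forcats function.

```r
# Hypothetical helper (not part of forcats): look up each factor value in a
# named numeric vector. Levels that are not mentioned in `mapping` become NA.
levels_to_numeric <- function(f, mapping) {
  stopifnot(is.factor(f), is.numeric(mapping), !is.null(names(mapping)))
  unname(mapping[as.character(f)])
}

income <- factor(c("Lt $1000", "$1000 to 2999", "Refused", "$3000 to 3999"))

# Illustrative values supplied by the user; nothing is computed from the labels.
values <- c("Lt $1000" = 500, "$1000 to 2999" = 2000, "$3000 to 3999" = 3500)

levels_to_numeric(income, values)
#> [1]  500 2000   NA 3500
```

"Refused" is not named in `values`, so it maps to NA, which matches the "unmentioned levels become NAs" rule.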

@hadley
Member Author

hadley commented Aug 17, 2016

or

gss_cat %>% 
  mutate(partyid = fct_recode(partyid,
    "Republican, strong" =           -3,
    "Republican, weak" =             -2,
    "Independent, near republican" = -1,
    "Independent" =                   0,
    "Independent, near democrat" =    1,
    "Democrat, weak" =                2,
    "Democrat, strong" =              3
  )) %>% 
  count(partyid)

@hadley
Member Author

hadley commented Aug 18, 2016

For this to work naturally the new/old order needs to be flipped :/

@hadley
Member Author

hadley commented Aug 18, 2016

@lionel- any thoughts here?

@lionel-
Member

lionel- commented Aug 18, 2016

Hmm... Could forcats reuse dplyr's tools for recoding? Then we have `-3` = "Republican, strong", which is a bit non-optimal from both the UI and implementation viewpoints, but at least we have the natural order and it's consistent with dplyr's recode functions.

Can this be an alias to recode_factor()? A downside is that the recoding code should probably be extracted out of dplyr then, to avoid the heavy dep.

PS: just saw the emojis on the repo page, had a good laugh :)

@satuhelske

This would be exactly what I need.

@hadley hadley added the feature a feature request or enhancement label Feb 10, 2018
@hadley
Member Author

hadley commented Jan 3, 2023

In dplyr 1.1, this will be dplyr::case_match()
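For readers landing here later, a rough sketch of what that could look like with `dplyr::case_match()` (requires dplyr >= 1.1.0). The level names and scores below are illustrative assumptions, and the factor is converted to character first to sidestep any type-casting questions.

```r
library(dplyr)

# Illustrative data; the level names here are assumptions for the example.
partyid <- factor(c("Strong republican", "Independent", "Strong democrat", "Other party"))

# Each formula maps old value ~ new value. Values that match no formula
# fall through to the default, which is NA.
score <- case_match(
  as.character(partyid),
  "Strong republican" ~ -3,
  "Independent"       ~  0,
  "Strong democrat"   ~  3
)

score
```

Here "Other party" is not mentioned, so its score is NA, in line with the original request.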

@hadley hadley closed this as completed Jan 3, 2023