Integrate fct_other functionality into fct_collapse #100

Closed
jmpsteen opened this issue Nov 29, 2017 · 6 comments

Comments

jmpsteen commented Nov 29, 2017

Thanks for this great package! I'm wondering though whether there would be any possibility of integrating features of fct_other into the fct_collapse function?

I currently need to recode a factor by collapsing a targeted set of levels (say c("A", "B", "C", "D", "E")) into new levels (say c("AB", "CDE")), and I want to recode all non-targeted levels (usually a long set that is tedious to specify manually, e.g. all other letters of the alphabet) into an "other" category. In other words, I'd like fct_collapse to have an additional argument (similar to fct_other's other_level argument) that, when specified, automatically identifies all non-targeted levels (i.e. those not listed in any of the vectors, which in fct_other would go in the drop argument) and collapses them into a single level whose name is given by this argument.

This feature would be quite helpful, because the only solution I can think of now is cumbersome: I have to repeat all the new (collapsed) level names from the fct_collapse call in the keep argument of fct_other, whereas I assume fct_collapse could easily identify them internally.
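
The workaround described above can be wrapped so that the collapsed level names are picked up from the argument names instead of being retyped. This is only a minimal sketch: `collapse_with_other()` is a hypothetical helper name, not part of forcats, and it relies solely on the existing `fct_collapse()` and `fct_other()` functions.

```r
library(forcats)

# Hypothetical helper (NOT a forcats function): collapse the named groups,
# then lump every level that was not targeted into `other_level`.
collapse_with_other <- function(f, ..., other_level = "Other") {
  groups <- list(...)
  collapsed <- fct_collapse(f, ...)
  # The new level names are the names of the arguments in `...`,
  # so they can be passed to `keep` without typing them a second time.
  fct_other(collapsed, keep = names(groups), other_level = other_level)
}

x <- factor(c("A", "B", "C", "D", "E", "F", "G"))
result <- collapse_with_other(x, AB = c("A", "B"), CDE = c("C", "D", "E"))
print(result)
```

With a wrapper like this, only the targeted levels are spelled out; everything else falls into the level named by `other_level`.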

jmpsteen added a commit to jmpsteen/forcats that referenced this issue Nov 29, 2017
possible solution for issue tidyverse#100
dholstius commented Dec 11, 2017

+1 for this. I often have use for a tabulation where the final element is “(other)”.

Lexically, the leading parenthesis makes “(other)” follow alphanum levels, so I’d suggest that as a default.

hadley added the reprex label Feb 10, 2018

BirgerNi commented Apr 15, 2018

+1 for this.

library(forcats)

set.seed(123)
data <- sample(LETTERS[1:10], 99, replace = TRUE) 


# What I had to do
data %>% 
  fct_collapse(AB = c("A", "B"), CDE = c("C", "D", "E")) %>%
  fct_other(keep = c("AB", "CDE"))
#>  [1] CDE   Other CDE   Other Other AB    Other Other Other CDE   Other
#> [12] CDE   Other Other AB    Other CDE   AB    CDE   Other Other Other
#> [23] Other Other Other Other Other Other CDE   AB    Other Other Other
#> [34] Other AB    CDE   Other CDE   CDE   CDE   AB    CDE   CDE   CDE  
#> [45] AB    AB    CDE   CDE   CDE   Other AB    CDE   Other AB    Other
#> [56] CDE   AB    Other Other CDE   Other AB    CDE   CDE   Other CDE  
#> [67] Other Other Other CDE   Other Other Other AB    CDE   CDE   CDE  
#> [78] Other CDE   AB    CDE   Other CDE   Other AB    CDE   Other Other
#> [89] Other AB    AB    Other CDE   Other CDE   AB    Other AB    CDE  
#> Levels: AB CDE Other


# What I'd like to do
# data %>%
#   fct_collapse(AB = c("A", "B"), CDE = c("C", "D", "E"), group_other = TRUE)

hadley (Member) commented Jan 4, 2019

Seems like a reasonable feature - the main thing to figure out is the name of the argument

AmeliaMN (Contributor) commented Jan 19, 2019

I'll try this one!
