Integrate fct_other functionality into fct_collapse #100

Closed
jmpsteen opened this Issue Nov 29, 2017 · 6 comments

jmpsteen commented Nov 29, 2017

Thanks for this great package! I'm wondering though whether there would be any possibility of integrating features of fct_other into the fct_collapse function?

I currently need to recode a factor by collapsing a targeted set of levels (say c("A", "B", "C", "D", "E")) into new levels (say c("AB", "CDE")), and I want to recode all non-targeted levels (usually too long a set to specify manually, e.g. all other letters of the alphabet) into an "other" category. In other words, I'd like fct_collapse to have an additional argument (similar to fct_other's other_level argument) that, when specified, automatically identifies all non-targeted levels (i.e. those not listed in any of the vectors, which in fct_other would be given in the drop argument) and collapses them into a single level named by this argument.

This feature would be quite helpful, because the only solution I can think of now is quite cumbersome. That is, I have to manually specify all new (collapsed) factor level names (entered in the fct_collapse call) in the keep argument of fct_other, while I assume these could easily be internally identified automatically in the fct_collapse function.
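
For readers skimming the thread, the two-step workaround described above might look roughly like the sketch below. The alphabet factor `x` is illustrative data and not part of the original report. Both `fct_collapse()` and `fct_other()` (with its `keep` and `other_level` arguments) are existing forcats functions.

```r
library(forcats)

# Illustrative data (an assumption, not from the issue): one value per letter
x <- factor(LETTERS)

# Step 1: collapse the targeted levels into the new levels AB and CDE
collapsed <- fct_collapse(x, AB = c("A", "B"), CDE = c("C", "D", "E"))

# Step 2: lump every remaining level into "Other".
# The new level names have to be repeated by hand in `keep`,
# which is the cumbersome part the issue asks to avoid.
result <- fct_other(collapsed, keep = c("AB", "CDE"), other_level = "Other")

levels(result)
table(result)
```

Note that the names given in the `fct_collapse()` call and the `keep` vector must be kept in sync manually, which becomes error-prone as the number of collapsed groups grows.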

jmpsteen added a commit to jmpsteen/forcats that referenced this issue Nov 29, 2017

Update collapse.R
possible solution for issue tidyverse#100

holstius commented Dec 11, 2017

+1 for this. I often have use for a tabulation where the final element is “(other)”.

Lexically, the leading parenthesis makes “(other)” follow alphanum levels, so I’d suggest that as a default.

Member

hadley commented Feb 10, 2018

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you!

If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

hadley added the reprex label Feb 10, 2018


BirgerNi commented Apr 15, 2018

+1 for this.

library(forcats)

set.seed(123)
data <- sample(LETTERS[1:10], 99, replace = TRUE) 


# What I had to do
data %>% 
  fct_collapse(AB = c("A", "B"), CDE = c("C", "D", "E")) %>%
  fct_other(keep = c("AB", "CDE"))
#>  [1] CDE   Other CDE   Other Other AB    Other Other Other CDE   Other
#> [12] CDE   Other Other AB    Other CDE   AB    CDE   Other Other Other
#> [23] Other Other Other Other Other Other CDE   AB    Other Other Other
#> [34] Other AB    CDE   Other CDE   CDE   CDE   AB    CDE   CDE   CDE  
#> [45] AB    AB    CDE   CDE   CDE   Other AB    CDE   Other AB    Other
#> [56] CDE   AB    Other Other CDE   Other AB    CDE   CDE   Other CDE  
#> [67] Other Other Other CDE   Other Other Other AB    CDE   CDE   CDE  
#> [78] Other CDE   AB    CDE   Other CDE   Other AB    CDE   Other Other
#> [89] Other AB    AB    Other CDE   Other CDE   AB    Other AB    CDE  
#> Levels: AB CDE Other


# What I'd like to do
# data %>%
#   fct_collapse(AB = c("A", "B"), CDE = c("C", "D", "E"), group_other = TRUE)

Member

batpigandme commented Apr 15, 2018

@BirgerNi thanks for the reprex.
FYI, re +1 (for @holstius as well), see our new article re. upvoting issues/FRs on github 👍:
https://www.tidyverse.org/articles/2018/04/upvoting-issues/

batpigandme removed the reprex label Apr 15, 2018

Member

hadley commented Jan 4, 2019

Seems like a reasonable feature - the main thing to figure out is the name of the argument

Contributor

AmeliaMN commented Jan 19, 2019

I'll try this one!

AmeliaMN added a commit to AmeliaMN/forcats that referenced this issue Jan 19, 2019

AmeliaMN added a commit to AmeliaMN/forcats that referenced this issue Jan 19, 2019

hadley closed this in #160 Jan 19, 2019

hadley added a commit that referenced this issue Jan 19, 2019

Add group_other option to fct_collapse, to allow for optional recoding of all non-targeted levels into an "Other" category (#160)

Fixes #100
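
As a rough illustration of what the merged change enables, the call requested earlier in the thread could be written as in the sketch below. It assumes a forcats version that includes the `group_other` argument named in the commit message. The alphabet factor is illustrative data. Later forcats releases may warn that `group_other` is deprecated in favour of a differently named argument, so check `?fct_collapse` for the installed version.

```r
library(forcats)

# Illustrative data (an assumption, not from the issue): one value per letter
x <- factor(LETTERS)

# Single call: collapse the targeted levels and send every
# non-targeted level to an "Other" category.
# `group_other` is the argument named in the commit above; newer
# versions may emit a deprecation warning for it.
result <- fct_collapse(
  x,
  AB = c("A", "B"),
  CDE = c("C", "D", "E"),
  group_other = TRUE
)

levels(result)
table(result)
```

Compared with the two-step `fct_collapse()` plus `fct_other()` workaround, the new level names no longer need to be repeated in a separate `keep` vector.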