fct_lump all with count above/below threshold #142

Closed
RoyalTS opened this issue Jul 24, 2018 · 3 comments · Fixed by #166 or #247
Labels
feature (a feature request or enhancement), tidy-dev-day 🤓 (Tidyverse Developer Day, rstd.io/tidy-dev-day)

Comments

@RoyalTS

RoyalTS commented Jul 24, 2018

It would be great if fct_lump() had an additional mode that lumped factor levels not based on the final number of factor levels desired (n) or the proportion of the data that accrues to the level (prop) but based on the raw count for the level (call it count?).

Stealing the example data from #43:

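# Build a factor with exponentially decaying level counts:
# levels 1..N appear 2^N, ..., 2^1 times; the next M levels appear once each.
N <- 5
M <- 2^N
exp_factor <- factor(rep(1:(N + M), c(2^(N:1), rep(1, M))))
# Tabulate how often each level occurs.
table(exp_factor)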
## exp_factor
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 
## 32 16  8  4  2  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
## 26 27 28 29 30 31 32 33 34 35 36 37 
##  1  1  1  1  1  1  1  1  1  1  1  1

Desired behavior: forcats::fct_lump(exp_factor, count = 1) would retain only the factor levels 1:5 and Other.

@AshesITR

That's a good idea. Maybe naming the argument min_count with comparison >= makes the functionality clearer...
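
For illustration, here is a minimal base-R sketch of the `min_count` idea with a `>=` comparison. The helper `lump_min_count()` and its argument names are hypothetical and are not part of forcats; the sketch only shows the intended semantics on the example data above.

```r
# Hypothetical helper (not a forcats function): keep every level whose raw
# count is >= min_count and collapse all remaining levels into other_level.
lump_min_count <- function(f, min_count, other_level = "Other") {
  f <- as.factor(f)
  counts <- table(f)
  keep <- names(counts)[as.vector(counts) >= min_count]

  # Nothing to lump: return the factor unchanged.
  if (length(keep) == nlevels(f)) {
    return(f)
  }

  x <- as.character(f)
  # Leave missing values alone; relabel only the rare levels.
  x[!is.na(x) & !(x %in% keep)] <- other_level
  factor(x, levels = unique(c(keep, other_level)))
}

# Example data from the issue.
N <- 5
M <- 2^N
exp_factor <- factor(rep(1:(N + M), c(2^(N:1), rep(1, M))))

lumped <- lump_min_count(exp_factor, min_count = 2)
levels(lumped)
# "1" "2" "3" "4" "5" "Other"
```

With `min_count = 2`, the 32 levels that occur exactly once end up in `Other`, which matches the desired behavior described in the original request.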

@hadley
Member

hadley commented Jan 4, 2019

I think this is a reasonable request, but my main concern is making fct_lump() too complicated. It might be time to start spinning out fct_lump_prop() etc.
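
As a sketch of what the split-out functions look like in use: forcats 0.5.0 and later export `fct_lump_min()`, `fct_lump_prop()`, and `fct_lump_n()`. The snippet below assumes such a version is installed and is skipped otherwise; the threshold values are chosen only for illustration.

```r
# Runs only if forcats >= 0.5.0 is available (an assumption about the
# reader's environment, not something stated in this thread).
if (requireNamespace("forcats", quietly = TRUE) &&
    utils::packageVersion("forcats") >= "0.5.0") {
  # Example data from the issue.
  N <- 5
  M <- 2^N
  exp_factor <- factor(rep(1:(N + M), c(2^(N:1), rep(1, M))))

  # Lump levels that appear fewer than `min` times.
  by_min <- forcats::fct_lump_min(exp_factor, min = 2)
  # Lump levels that make up no more than `prop` of the data.
  by_prop <- forcats::fct_lump_prop(exp_factor, prop = 0.1)
  # Keep the `n` most frequent levels and lump the rest.
  by_n <- forcats::fct_lump_n(exp_factor, n = 3)

  print(levels(by_min))
}
```

Each function takes a single threshold argument, which keeps the individual signatures simpler than one `fct_lump()` with several mutually exclusive modes.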

@hadley added the feature and tidy-dev-day labels Jan 4, 2019

@robinsones
Contributor

👋 I'll try it

4 participants