n argument for step_other #289

Closed
brshallo opened this issue Feb 28, 2019 · 3 comments · Fixed by #361
Labels
tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day

Comments

@brshallo

brshallo commented Feb 28, 2019

It would be helpful if the threshold argument of step_other() could be specified as an integer count (rather than only as a proportion), or if there were a separate n argument giving the minimum number of observations a level must have; levels with fewer observations would be collapsed into the "other" category.
E.g. to specify a minimum sample size of 30:

recipe(price ~ clarity + color + carat, data = diamonds) %>% 
  step_other(all_nominal(), n = 30)

I'm curious what the best way of doing this is currently? (Link to hack for specifying consistent sample size to step_other across datasets of different sizes.)

@brshallo
Author

It seems that using step_mutate() with forcats::fct_lump_min() would accomplish this. Is there a way, though, to apply step_mutate() over all_nominal(), the way one might with mutate_if()/mutate_at()/mutate_all()?
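One way this could look, as a minimal sketch rather than a recommended solution: step_mutate_at() accepts selectors such as all_nominal() together with a function, so fct_lump_min() can be applied to every nominal column at once. The toy data frame, its column names, and the cutoff of 3 are made up for illustration, and forcats is assumed to be installed. Note that a mutate-based step is not trained, so the counts are recomputed on whatever data is baked rather than remembered from the training set.

```r
library(recipes)

# Hypothetical toy data: levels "c" and "d" of `grp` are rare.
toy <- data.frame(
  y   = seq_len(13),
  grp = factor(c(rep("a", 5), rep("b", 5), rep("c", 2), "d")),
  col = factor(c(rep("x", 7), rep("y", 6)))
)

# Lump every level seen fewer than 3 times into "other",
# for all nominal columns selected by all_nominal().
rec_lump <- recipe(y ~ grp + col, data = toy) %>%
  step_mutate_at(
    all_nominal(),
    fn = ~ forcats::fct_lump_min(.x, min = 3, other_level = "other")
  )

# Bake the training data itself to inspect the result.
lumped <- bake(prep(rec_lump, training = toy), new_data = NULL)
table(lumped$grp)
```

Here `grp` should end up with the levels a, b and other, while `col` is left alone because both of its levels meet the cutoff.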

topepo added the tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day label Jun 27, 2019
@topepo
Member

topepo commented Jun 28, 2019

Instead of adding a new argument, we could just use threshold and, if it is >1, treat it as a frequency.
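A minimal sketch of what this proposal looks like in use, assuming a version of recipes in which step_other() treats a threshold of one or more as a count instead of a proportion. The toy data frame and the cutoff of 3 are made up for illustration.

```r
library(recipes)

# Hypothetical toy data: levels "c" and "d" of `grp` are rare.
toy <- data.frame(
  y   = seq_len(13),
  grp = factor(c(rep("a", 5), rep("b", 5), rep("c", 2), "d")),
  col = factor(c(rep("x", 7), rep("y", 6)))
)

# threshold = 3 is read as a frequency: levels occurring fewer than
# 3 times in the training set are pooled into "other".
rec_other <- recipe(y ~ grp + col, data = toy) %>%
  step_other(all_nominal(), threshold = 3)

pooled <- bake(prep(rec_other, training = toy), new_data = NULL)
table(pooled$grp)
```

Unlike a mutate-based workaround, step_other() learns which levels to keep when the recipe is prepped, so the same pooling is applied to new data regardless of its size.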

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

github-actions bot locked and limited conversation to collaborators Feb 23, 2021