Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New role selectors #993

Merged
merged 49 commits into from
Oct 20, 2022
Merged

New role selectors #993

merged 49 commits into from
Oct 20, 2022

Conversation

EmilHvitfeldt
Copy link
Member

@EmilHvitfeldt EmilHvitfeldt commented May 26, 2022

This PR aims to close #942, close #669 and helps with #193.

It does this by expanding the role selectors we have available.

without addressing the *_predictors() variants we used to have

  • all_numeric()
  • all_nominal()

This PR makes it so we have

  • all_numeric()
    • all_integer()
    • all_double()
  • all_nominal()
    • all_ordered()
    • all_unordered()
      • all_string()
      • all_factor()
  • all_logical()
  • all_date()
  • all_datetime()

The structure above is to indicate that all_numeric() captures both integers and doubles.

I made the changes to be backwards compatible. The changes are many in a way such that the type variable in var_info is a list column instead of a character column.

TODO:

  • Fix tests
  • Document that using -all_outcomes() is an anti-pattern
  • update steps to work with changes, some steps uses testing like this which doesn't work anymore
  • add alll_strings and all_factors

@topepo
Copy link
Member

topepo commented May 29, 2022

For this to work well, can you take a look at all of the steps that produce indicators (or other integers) and make sure that they all produce actual integers? Otherwise, we'll just beed fielding more issues when this PR is merged.

R/misc.R Outdated Show resolved Hide resolved
Merge commit '885b2c2f89acfd9181ffc802978d852edb89b47d'

Conflicts:
	NEWS.md
	tests/testthat/_snaps/R4.1/selections.md
	tests/testthat/_snaps/R4.2/selections.md
@EmilHvitfeldt
Copy link
Member Author

The failing build right now is related to some new changes in r-devel related to dates... https://github.com/tidymodels/recipes/actions/runs/3192498090/jobs/5210042479#step:6:234

should be dealt with in a separate PR

Copy link
Member

@topepo topepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed it but can you save a few existing recipes and make sure that it is backwards compatible in unit tests

R/get_types_recipes.R Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
R/get_types_recipes.R Outdated Show resolved Hide resolved
R/get_types_recipes.R Outdated Show resolved Hide resolved
R/selections.R Outdated Show resolved Hide resolved
R/selections.R Outdated Show resolved Hide resolved
@EmilHvitfeldt
Copy link
Member Author

This change is backwards compatible in the sense that you an old unprepped recipe is identical to a similarly constructed recipe after being prepped.

# 1.0.1
library(recipes)
rec_spec <- recipe(mpg ~ ., data = mtcars) %>%
  step_poly(disp) %>%
  step_normalize(all_numeric_predictors())

readr::write_rds(rec_spec, "rec_spec.rds")

# This PR
library(recipes)
rec_spec <- readr::read_rds("rec_spec.rds")

new_rec_spec <- recipe(mpg ~ ., data = mtcars) %>%
  step_poly(disp) %>%
  step_normalize(all_numeric_predictors())

identical(prep(rec_spec), prep(new_rec_spec))

You can also bake an old prepped recipe in the new version no problem.

# 1.0.1
library(recipes)
rec_spec <- recipe(mpg ~ ., data = mtcars) %>%
  step_poly(disp) %>%
  step_normalize(all_numeric_predictors()) %>%
  prep()

readr::write_rds(rec_spec, "rec_spec.rds")

# This PR
library(recipes)
rec_spec <- readr::read_rds("rec_spec.rds")

new_rec_spec <- recipe(mpg ~ ., data = mtcars) %>%
  step_poly(disp) %>%
  step_normalize(all_numeric_predictors())  %>%
  prep()

identical(bake(rec_spec, mtcars), bake(new_rec_spec, mtcars))

Where the it brakes down if you try to add steps to a old recipes that has been prepped, and at least one of the steps produced a new variable.

# 1.0.1
library(recipes)
rec_spec <- recipe(mpg ~ ., data = mtcars) %>%
  step_poly(disp) %>%
  step_normalize(all_numeric_predictors()) %>%
  prep()

readr::write_rds(rec_spec, "rec_spec.rds")

# This PR
library(recipes)
rec_spec <- readr::read_rds("rec_spec.rds")

# Doesn't work
rec_spec %>% 
  step_date(all_date()) %>% #assuming that mtcars had a date variable
  prep() %>%
  bake(data = mtcars)

@topepo topepo merged commit 4b57766 into main Oct 20, 2022
@topepo topepo deleted the new-role-selectors branch October 20, 2022 20:11
@github-actions
Copy link

github-actions bot commented Nov 4, 2022

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Selectors for ordered factors (ordinal data) Add table to has_role() documentation
4 participants