Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upNo formula and adding a step causes all variables to have role predictor role #296
Comments
|
Two things. First this is a bug, so thanks for reporting! Second, generally you should be using suppressPackageStartupMessages({
library(recipes)
library(nlme)
})
obj_rec <- recipe(x = as.data.frame(Orthodont)) %>%
update_role(distance, new_role = 'outcome') %>%
update_role(age, new_role = 'predictor') %>%
update_role(Sex, new_role = 'predictor') %>%
step_dummy(Sex)
obj_prep <- prep(obj_rec, training = Orthodont)
juice(obj_prep, all_predictors())
#> # A tibble: 108 x 3
#> age Subject Sex_Female
#> <dbl> <ord> <dbl>
#> 1 8 M01 0
#> 2 10 M01 0
#> 3 12 M01 0
#> 4 14 M01 0
#> 5 8 M02 0
#> 6 10 M02 0
#> 7 12 M02 0
#> 8 14 M02 0
#> 9 8 M03 0
#> 10 10 M03 0
#> # … with 98 more rowsCreated on 2019-03-15 by the reprex package (v0.2.1.9000) This is what I mean by the fact that you should not be adding roles when suppressPackageStartupMessages({
library(recipes)
library(nlme)
})
obj_rec <- recipe(x = as.data.frame(Orthodont)) %>%
add_role(distance, new_role = 'outcome') %>%
add_role(age, new_role = 'predictor') %>%
add_role(Sex, new_role = 'predictor') %>%
step_dummy(Sex)
summary(obj_rec)
#> # A tibble: 7 x 4
#> variable type role source
#> <chr> <chr> <chr> <chr>
#> 1 distance numeric <NA> original
#> 2 distance numeric outcome original
#> 3 age numeric <NA> original
#> 4 age numeric predictor original
#> 5 Subject nominal <NA> original
#> 6 Sex nominal <NA> original
#> 7 Sex nominal predictor originalCreated on 2019-03-15 by the reprex package (v0.2.1.9000) I've opened a PR #297 to fix this issue. With that fixed version installed we get the desired behavior: suppressPackageStartupMessages({
library(recipes)
library(nlme)
})
obj_rec <- recipe(x = as.data.frame(Orthodont)) %>%
update_role(distance, new_role = 'outcome') %>%
update_role(age, new_role = 'predictor') %>%
update_role(Sex, new_role = 'predictor') %>%
step_dummy(Sex)
obj_prep <- prep(obj_rec, training = Orthodont)
juice(obj_prep, all_predictors())
#> # A tibble: 108 x 2
#> age Sex_Female
#> <dbl> <dbl>
#> 1 8 0
#> 2 10 0
#> 3 12 0
#> 4 14 0
#> 5 8 0
#> 6 10 0
#> 7 12 0
#> 8 14 0
#> 9 8 0
#> 10 10 0
#> # … with 98 more rowsCreated on 2019-03-15 by the reprex package (v0.2.1.9000) |
Hi,
Firstly, thank you for this wonderful package.
I have query regarding the behaviour of adding a step (example below uses step_dummy). When I don't add a formula and perform juice(prepped object, all_predictors()), this returns all variables. It seems by not having a role, then adding step it makes all variables have a role of predictor.
Please see the examples below. I've tried to illustrate the issue in the first one, with expected behaviour in the next two examples.
Please excuse me if I've misused the package.
Best wishes,
Oli