Change printing for step_impute_knn()
#837
There are a couple of things going on here that may be contributing to some confusion. When you do the data splitting, you end up with data in [...]. If instead we only add two [...]:
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
iris <- tibble(iris)
iris[1, 2] <- NA_real_
iris[1, 3] <- NA_real_
iris
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 NA NA 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # … with 140 more rows
base_rec <- recipe(Sepal.Length ~ ., data = iris) %>%
step_impute_knn(Petal.Length)
prep(base_rec)
#> Recipe
#>
#> Inputs:
#>
#> role #variables
#> outcome 1
#> predictor 4
#>
#> Training data contained 150 data points and 1 incomplete row.
#>
#> Operations:
#>
#> K-nearest neighbor imputation for Sepal.Width, Petal.Width, Species [trained]
prep(base_rec) %>% bake(new_data = NULL)
#> # A tibble: 150 × 5
#> Sepal.Width Petal.Length Petal.Width Species Sepal.Length
#> <dbl> <dbl> <dbl> <fct> <dbl>
#> 1 NA 1.44 0.2 setosa 5.1
#> 2 3 1.4 0.2 setosa 4.9
#> 3 3.2 1.3 0.2 setosa 4.7
#> 4 3.1 1.5 0.2 setosa 4.6
#> 5 3.6 1.4 0.2 setosa 5
#> 6 3.9 1.7 0.4 setosa 5.4
#> 7 3.4 1.4 0.3 setosa 4.6
#> 8 3.4 1.5 0.2 setosa 5
#> 9 2.9 1.4 0.2 setosa 4.4
#> 10 3.1 1.5 0.1 setosa 4.9
#> # … with 140 more rows
Created on 2021-10-20 by the reprex package (v2.0.1)
The printing is confusing because it is telling you the variables that are possibly being used for imputation, not the variable that is being imputed. Maybe we can fix that here: Line 275 in ca9f84b
We could just change "for" to "with"? Or we could change which variables are printed, showing the ones that are being imputed rather than the ones used to impute them.
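Until the printing changes, one way to see which variable is imputed and which variables are used to impute it is to call `tidy()` on the trained step. The following is a minimal sketch, assuming a version of recipes that provides `step_impute_knn()`; the object names `iris_na`, `knn_rec`, `knn_prepped`, and `knn_info` are illustrative and not part of the original thread.

```r
library(recipes)

# Same setup as the reprex above: two missing values in the first row.
iris_na <- tibble::as_tibble(iris)
iris_na[1, 2] <- NA_real_  # Sepal.Width
iris_na[1, 3] <- NA_real_  # Petal.Length

knn_rec <- recipe(Sepal.Length ~ ., data = iris_na) %>%
  step_impute_knn(Petal.Length)

knn_prepped <- prep(knn_rec)

# tidy() on the first (and only) step reports the imputed variable in
# `terms` and the variables used for the imputation in `predictors`.
knn_info <- tidy(knn_prepped, number = 1)
knn_info
```

Here `terms` should contain only `Petal.Length`, while `predictors` should match the variables shown in the printed summary.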
Hi, thanks for the explanation.
No, we'll leave this open and close it when we change the printing! 👍
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.
step_impute_knn() ignores the variable that should be imputed; instead, it imputes values in all possible numeric variables:
prep(base_rec)
K-nearest neighbor imputation for Sepal.Width, Petal.Width, Species [trained]
whereas it should be only Petal.Length.
Best,
Seweryn
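A quick way to check which columns the step actually changes is to compare missing-value counts before and after baking. This is a sketch under the same assumptions as the reprex in this thread; `iris_na`, `baked`, `na_before`, and `na_after` are illustrative names.

```r
library(recipes)

iris_na <- tibble::as_tibble(iris)
iris_na[1, 2] <- NA_real_  # Sepal.Width
iris_na[1, 3] <- NA_real_  # Petal.Length

baked <- recipe(Sepal.Length ~ ., data = iris_na) %>%
  step_impute_knn(Petal.Length) %>%
  prep() %>%
  bake(new_data = NULL)

# Count missing values per column; bake() reorders columns,
# so align the "after" counts to the original column order by name.
na_before <- colSums(is.na(iris_na))
na_after <- colSums(is.na(baked))[names(na_before)]
rbind(before = na_before, after = na_after)
```

Consistent with the baked output shown in the thread, only `Petal.Length` loses its missing value, and `Sepal.Width` keeps its `NA`.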