Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

select_by_*() silently ignores ordering selection if it is set as a string #530

Open
EmilHvitfeldt opened this issue Jul 26, 2022 · 1 comment
Labels
bug an unexpected problem or unintended behavior

Comments

@EmilHvitfeldt
Copy link
Member

If you set "K" instead of K, it creates and sorts by the constant string "K" effectively ignoring it. This should ideally throw an warning/error

library(tune)
data("example_ames_knn")

select_by_one_std_err(ames_grid_search, metric = "rmse", K)
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1     5 rank      0.411     2     7 rmse    standa… 0.0740    10 0.00328 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names
select_by_one_std_err(ames_grid_search, metric = "rmse", "K")
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1    21 cos       0.626     1     4 rmse    standa… 0.0746    10 0.00359 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names
select_by_one_std_err(ames_grid_search, metric = "rmse", 1)
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1    21 cos       0.626     1     4 rmse    standa… 0.0746    10 0.00359 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names

Created on 2022-07-26 by the reprex package (v2.0.1)

@DavisVaughan
Copy link
Member

Since the ... are passed on to arrange() and I guess we want things like desc(K) to work, there isn't a whole lot we can do here.

arrange() is data-masking and mutate-like, so we can't make the ... of select_by_one_std_err() tidyselect-like like I did with yardstick's metric functions like in tidymodels/yardstick#328

The only thing we could do is iterate over the dots we capture with dots <- rlang::enquos(...) and check that quo_is_symbol(quo) || quo_is_call(quo). If that isn't true, then the user probably provided a literal value like "K" or 1 and we should error there

@simonpcouch simonpcouch added the bug an unexpected problem or unintended behavior label Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants