Join warnings generated during model fitting #526
Hello @mattwarkentin 👋 Thanks for the small reprex! What version of {tune} do you have installed? I'm not able to reproduce your error using the most recent CRAN version of {tune}:
library(tidymodels)
workflow(
preprocessor = mpg ~ .,
spec = linear_reg(engine = 'glmnet', penalty = tune(), mixture = 1)
) %>%
tune_grid(resamples = vfold_cv(mtcars))
#> # Tuning results
#> # 10-fold cross-validation
#> # A tibble: 10 × 4
#> splits id .metrics .notes
#> <list> <chr> <list> <list>
#> 1 <split [28/4]> Fold01 <tibble [20 × 5]> <tibble [0 × 3]>
#> 2 <split [28/4]> Fold02 <tibble [20 × 5]> <tibble [0 × 3]>
#> 3 <split [29/3]> Fold03 <tibble [20 × 5]> <tibble [0 × 3]>
#> 4 <split [29/3]> Fold04 <tibble [20 × 5]> <tibble [0 × 3]>
#> 5 <split [29/3]> Fold05 <tibble [20 × 5]> <tibble [0 × 3]>
#> 6 <split [29/3]> Fold06 <tibble [20 × 5]> <tibble [0 × 3]>
#> 7 <split [29/3]> Fold07 <tibble [20 × 5]> <tibble [0 × 3]>
#> 8 <split [29/3]> Fold08 <tibble [20 × 5]> <tibble [0 × 3]>
#> 9 <split [29/3]> Fold09 <tibble [20 × 5]> <tibble [0 × 3]>
#> 10 <split [29/3]> Fold10 <tibble [20 × 5]> <tibble [0 × 3]>
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.1 (2022-06-23)
#> os macOS Monterey 12.2.1
#> system aarch64, darwin20
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Los_Angeles
#> date 2022-07-14
#> pandoc 2.17.1.1 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
#> backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
#> broom * 1.0.0 2022-07-01 [1] CRAN (R 4.2.0)
#> class 7.3-20 2022-01-16 [1] CRAN (R 4.2.1)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> codetools 0.2-18 2020-11-04 [1] CRAN (R 4.2.1)
#> colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.0)
#> dials * 1.0.0 2022-06-14 [1] CRAN (R 4.2.0)
#> DiceDesign 1.9 2021-02-13 [1] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
#> dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
#> foreach 1.5.2 2022-02-02 [1] CRAN (R 4.2.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
#> furrr 0.3.0 2022-05-04 [1] CRAN (R 4.2.0)
#> future 1.26.1 2022-05-27 [1] CRAN (R 4.2.0)
#> future.apply 1.9.0 2022-04-25 [1] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
#> ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
#> glmnet * 4.1-4 2022-04-15 [1] CRAN (R 4.2.0)
#> globals 0.15.1 2022-06-24 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gower 1.0.0 2022-02-03 [1] CRAN (R 4.2.0)
#> GPfit 1.0-8 2019-02-08 [1] CRAN (R 4.2.0)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
#> hardhat 1.2.0 2022-06-30 [1] CRAN (R 4.2.1)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
#> infer * 1.0.2 2022-06-10 [1] CRAN (R 4.2.0)
#> ipred 0.9-13 2022-06-02 [1] CRAN (R 4.2.0)
#> iterators 1.0.14 2022-02-05 [1] CRAN (R 4.2.0)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lattice 0.20-45 2021-09-22 [1] CRAN (R 4.2.1)
#> lava 1.6.10 2021-09-02 [1] CRAN (R 4.2.0)
#> lhs 1.1.5 2022-03-22 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
#> listenv 0.8.0 2019-12-05 [1] CRAN (R 4.2.0)
#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> MASS 7.3-57 2022-04-22 [1] CRAN (R 4.2.0)
#> Matrix * 1.4-1 2022-03-23 [1] CRAN (R 4.2.1)
#> modeldata * 1.0.0 2022-07-01 [1] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
#> nnet 7.3-17 2022-01-16 [1] CRAN (R 4.2.1)
#> parallelly 1.32.0 2022-06-07 [1] CRAN (R 4.2.0)
#> parsnip * 1.0.0 2022-06-16 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> prodlim 2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#> purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.2.0)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
#> R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> Rcpp 1.0.8.3 2022-03-17 [1] CRAN (R 4.2.0)
#> recipes * 1.0.1 2022-07-07 [1] CRAN (R 4.2.1)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
#> rlang 1.0.4 2022-07-12 [1] CRAN (R 4.2.1)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rpart 4.1.16 2022-01-24 [1] CRAN (R 4.2.1)
#> rsample * 1.0.0 2022-06-24 [1] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
#> scales * 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> shape 1.4.6 2021-05-19 [1] CRAN (R 4.2.0)
#> stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.1)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
#> survival 3.3-1 2022-03-03 [1] CRAN (R 4.2.1)
#> tibble * 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> tidymodels * 1.0.0 2022-07-13 [1] CRAN (R 4.2.1)
#> tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
#> timeDate 3043.102 2018-02-21 [1] CRAN (R 4.2.0)
#> tune * 1.0.0 2022-07-07 [1] CRAN (R 4.2.1)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> workflows * 1.0.0 2022-07-05 [1] CRAN (R 4.2.0)
#> workflowsets * 1.0.0 2022-07-12 [1] CRAN (R 4.2.1)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
#> yardstick * 1.0.0 2022-06-06 [1] CRAN (R 4.2.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Created on 2022-07-14 by the reprex package (v2.0.1)
Seconding Emil, thanks for the small reprex. :) Just dropping a note that I'm able to reproduce with dev dplyr:
library(tidymodels)
workflow(
preprocessor = mpg ~ .,
spec = linear_reg(engine = 'glmnet', penalty = tune(), mixture = 1)
) %>%
tune_grid(resamples = vfold_cv(mtcars))
#> ! Fold01: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold02: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold03: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold04: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold05: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold06: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold07: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold08: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold09: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> ! Fold10: preprocessor 1/1, model 1/1 (predictions): Each row in `x` should match at most 1 row in `y`.
#> # Tuning results
#> # 10-fold cross-validation
#> # A tibble: 10 × 4
#> splits id .metrics .notes
#> <list> <chr> <list> <list>
#> 1 <split [28/4]> Fold01 <tibble [20 × 5]> <tibble [1 × 3]>
#> 2 <split [28/4]> Fold02 <tibble [20 × 5]> <tibble [1 × 3]>
#> 3 <split [29/3]> Fold03 <tibble [20 × 5]> <tibble [1 × 3]>
#> 4 <split [29/3]> Fold04 <tibble [20 × 5]> <tibble [1 × 3]>
#> 5 <split [29/3]> Fold05 <tibble [20 × 5]> <tibble [1 × 3]>
#> 6 <split [29/3]> Fold06 <tibble [20 × 5]> <tibble [1 × 3]>
#> 7 <split [29/3]> Fold07 <tibble [20 × 5]> <tibble [1 × 3]>
#> 8 <split [29/3]> Fold08 <tibble [20 × 5]> <tibble [1 × 3]>
#> 9 <split [29/3]> Fold09 <tibble [20 × 5]> <tibble [1 × 3]>
#> 10 <split [29/3]> Fold10 <tibble [20 × 5]> <tibble [1 × 3]>
#>
#> There were issues with some computations:
#>
#> - Warning(s) x10: Each row in `x` should match at most 1 row in `y`.
#>
#> Run `show_notes(.Last.tune.result)` for more information.
packageVersion("dplyr")
#> [1] '1.0.99.9000'
Created on 2022-07-15 by the reprex package (v2.0.1)
The warning comes from dplyr's PRs 5910 and 6269.
A partial fix is at tidymodels/parsnip#772! Unfortunately, this doesn't actually arise from the tune grid paths, so this bug might live in many places. Comes from
Thanks for pointing this out, @mattwarkentin.
Some notes at https://gist.github.com/simonpcouch/b47e618fa6ebac6ed4995765169a87bb. This round of join errors came up in
Related issue at #528.
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Hi,
I am not actually sure if this belongs in {tune} or not, but a new warning is popping up when fitting models. I think the issue is due to the newer versions of the *_join() functions from {dplyr}, which now have a multiple argument that handles what happens when there are 1-to-many matches. If there are multiple matches, it now emits a warning by default. I'm not sure why the warning is being emitted in the example below, but it is.