add collect_*() function for extracted objects #579

Merged: simonpcouch merged 4 commits into main from extracts-409 on Nov 28, 2022

Conversation

simonpcouch
Contributor

Closes #409. 🦃 I opted to call this helper collect_extracts(), rather than the name suggested in the issue, for consistency with the other collect_*() function names, but I'm open to discussion here.

library(tidymodels)

res_fit <- 
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_resamples(extract = extract_fit_engine)
  )

res_nothing <- 
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(mtcars, 5)
  )

res_error <- 
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_resamples(extract = function(x) {stop("no!")})
  )
#> x Bootstrap1: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): no!
#> x Bootstrap2: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): no!
#> x Bootstrap3: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): no!
#> x Bootstrap4: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): no!
#> x Bootstrap5: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): no!

collect_extracts(res_fit)
#> # A tibble: 5 × 3
#>   id         .extracts .config             
#>   <chr>      <list>    <chr>               
#> 1 Bootstrap1 <lm>      Preprocessor1_Model1
#> 2 Bootstrap2 <lm>      Preprocessor1_Model1
#> 3 Bootstrap3 <lm>      Preprocessor1_Model1
#> 4 Bootstrap4 <lm>      Preprocessor1_Model1
#> 5 Bootstrap5 <lm>      Preprocessor1_Model1

collect_extracts(res_nothing)
#> Error in `collect_extracts()`:
#> ! Failed to collect extracted objects.
#> ℹ Please supply a control object (`?tune::control_grid()`) with a non-`NULL`
#>   `extract` argument during resample fitting.

collect_extracts(res_error)
#> # A tibble: 5 × 3
#>   id         .extracts      .config             
#>   <chr>      <list>         <chr>               
#> 1 Bootstrap1 <try-errr [1]> Preprocessor1_Model1
#> 2 Bootstrap2 <try-errr [1]> Preprocessor1_Model1
#> 3 Bootstrap3 <try-errr [1]> Preprocessor1_Model1
#> 4 Bootstrap4 <try-errr [1]> Preprocessor1_Model1
#> 5 Bootstrap5 <try-errr [1]> Preprocessor1_Model1

Created on 2022-11-17 with reprex v2.0.2

Interactively in the IDE, the error message from collect_extracts(res_nothing) renders "control object" as a hyperlink to those docs, rather than printing ?tune::control_grid() in parentheses.
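
For readers wondering what to do with the `.extracts` list column shown above, here is a minimal sketch of one possible follow-up. It assumes a tune version that includes the collect_extracts() helper from this PR. The seed, the `failed` column, and the `coefs` object are illustrative names that do not appear in the PR. The sketch flags any rows where the extractor returned a try-error (as in the res_error example) before tidying the remaining lm fits.

```r
library(tidymodels)

set.seed(579)  # arbitrary seed, only to make the bootstraps reproducible

res_fit <-
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_resamples(extract = extract_fit_engine)
  )

extracts <-
  collect_extracts(res_fit) %>%
  # TRUE where the extractor errored and tune stored a try-error instead
  mutate(failed = map_lgl(.extracts, function(x) inherits(x, "try-error")))

coefs <-
  extracts %>%
  filter(!failed) %>%
  # one row per model term for each resample
  mutate(coefs = map(.extracts, broom::tidy)) %>%
  select(id, .config, coefs) %>%
  unnest(coefs)
```

With the extractor used here, none of the rows should be flagged as failed, so `coefs` ends up with one block of coefficient rows per bootstrap resample.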

@simonpcouch simonpcouch requested a review from hfrick November 22, 2022 18:06
Member

@hfrick hfrick left a comment


I like the name!

I think it would be helpful if the error said that there was nothing to collect, rather than that collection failed in general (there could be so many reasons for this). This is also in the spirit of what collect_predictions() does.

collect_notes() returns location while collect_extracts() returns .config. They look similar; are they the same information, just displayed differently?

Do we still want a collect_glmnet_coefficients()? If so, then let's open a new issue for it so that this idea doesn't get lost when closing #409

@simonpcouch
Contributor Author

> I think it would be helpful if the error said that there was nothing to collect, rather than that collection failed in general (there could be so many reasons for this). This is also in the spirit of what collect_predictions() does.

Agreed! Addressed in 265e311. :)
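
As an illustration of the kind of check being discussed, here is a rough sketch of a more specific "nothing to collect" error. This is not the code from 265e311: `check_has_extracts()` is a hypothetical helper, the message wording is invented for the example, and it assumes that results fitted without an extractor simply lack an `.extracts` column.

```r
library(rlang)

# Hypothetical helper, not the implementation from the commit above.
# Assumes results fitted without `extract` have no `.extracts` column.
check_has_extracts <- function(x, call = caller_env()) {
  if (!".extracts" %in% names(x)) {
    abort(
      c(
        "There are no extracted objects to collect.",
        "i" = "Was a control object with a non-NULL `extract` argument supplied during fitting?"
      ),
      call = call
    )
  }
  # Return the input invisibly so the check can sit in a pipeline
  invisible(x)
}
```

The point of the design is that the error names the specific cause (no extracts were requested) rather than reporting a general failure.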

> collect_notes() returns location while collect_extracts() returns .config. They look similar; are they the same information, just displayed differently?

Similar information, though my understanding is that locations are put together for interactive debugging: they only occur when things go wrong, and they also record which part of the tune code path (i.e. within Preproc/Model/Repeat/Iter) the issue occurred in. .configs, by contrast, are constructed for straightforward joining.
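
To make the joining point concrete, here is a small sketch that pairs each extracted fit with its per-resample RMSE. It assumes a tune version that includes collect_extracts(). The seed and the `per_resample` name are illustrative, and joining on both id and .config is one reasonable choice rather than a prescribed one.

```r
library(tidymodels)

set.seed(579)  # arbitrary seed, only to make the bootstraps reproducible

res_fit <-
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_resamples(extract = extract_fit_engine)
  )

per_resample <-
  # unsummarized metrics: one row per resample and metric
  collect_metrics(res_fit, summarize = FALSE) %>%
  filter(.metric == "rmse") %>%
  # id and .config together identify a resample/configuration pair
  inner_join(collect_extracts(res_fit), by = c("id", ".config"))
```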

> Do we still want a collect_glmnet_coefficients()? If so, then let's open a new issue for it so that this idea doesn't get lost when closing #409

You got it! #582

@simonpcouch simonpcouch requested a review from hfrick November 28, 2022 14:19
Member

@hfrick hfrick left a comment


🚢 Nice!

@simonpcouch simonpcouch merged commit b0d5254 into main Nov 28, 2022
@simonpcouch simonpcouch deleted the extracts-409 branch November 28, 2022 14:54
@github-actions

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2022
Development

Successfully merging this pull request may close these issues.

collect_extracted_objects
2 participants