Make function fast_classification() #65

Closed
spsanderson opened this issue Dec 30, 2022 · 0 comments

spsanderson commented Dec 30, 2022

Function:

#' Generate Model Specification calls to `parsnip`
#'
#' @family Model_Generator
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @details With this function you can generate a tibble output of any classification
#' model specification and its fitted `workflow` object. Per the recipes documentation,
#' specifically for `step_string2factor()`, it is encouraged to convert your outcome
#' variable into a factor before you create your recipe.
#'
#' @description Creates a list/tibble of parsnip model specifications.
#'
#' @param .data The data being passed to the function for the classification problem
#' @param .rec_obj The recipe object being passed.
#' @param .parsnip_fns The default is 'all' which will create all possible
#' classification model specifications supported.
#' @param .parsnip_eng The default is 'all' which will use all supported engines
#' for the classification model specifications.
#' @param .split_type The default is 'initial_split', you can pass any type of
#' split supported by `rsample`
#' @param .split_args The default is NULL. When NULL, the default parameters
#' of the chosen `rsample` split type are used.
#'
#' @examples
#' library(recipes, quietly = TRUE)
#' library(dplyr, quietly = TRUE)
#' 
#' df <- mtcars %>% mutate(cyl = as.factor(cyl))
#' rec_obj <- recipe(cyl ~ ., data = df)
#'   
#' fct_tbl <- fast_classification(
#'   .data = df, 
#'   .rec_obj = rec_obj, 
#'   .parsnip_eng = c("glm","LiblineaR"))
#'   
#' glimpse(fct_tbl)
#'
#' @return
#' A list or a tibble.
#'
#' @name fast_classification
NULL

#' @export
#' @rdname fast_classification
#'

fast_classification <- function(.data, .rec_obj, .parsnip_fns = "all",
                            .parsnip_eng = "all", .split_type = "initial_split",
                            .split_args = NULL){
  
  # Tidy Eval ----
  call <- list(.parsnip_fns) %>%
    purrr::flatten_chr()
  engine <- list(.parsnip_eng) %>%
    purrr::flatten_chr()
  
  rec_obj <- .rec_obj
  split_type <- .split_type
  split_args <- .split_args
  
  # Checks ----
  
  # Get data splits
  df <- dplyr::as_tibble(.data)
  splits_obj <- create_splits(
    .data = df,
    .split_type = split_type,
    .split_args = split_args
  )
  
  # Generate Model Spec Tbl
  mod_spec_tbl <- fast_classification_parsnip_spec_tbl(
    .parsnip_fns = call,
    .parsnip_eng = engine
  )
  
  # Generate Workflow object
  mod_tbl <- mod_spec_tbl %>%
    dplyr::mutate(
      wflw = internal_make_wflw(mod_spec_tbl, .rec_obj = rec_obj)
    )
  
  mod_fitted_tbl <- mod_tbl %>%
    dplyr::mutate(
      fitted_wflw = internal_make_fitted_wflw(mod_tbl, splits_obj)
    )
  
  mod_pred_tbl <- mod_fitted_tbl %>%
    dplyr::mutate(
      pred_wflw = internal_make_wflw_predictions(mod_fitted_tbl, splits_obj)
    )
  
  
  # Return ----
  # Attach the class and attributes to the object that is actually returned
  class(mod_pred_tbl) <- c("fst_reg_tbl", class(mod_pred_tbl))
  attr(mod_pred_tbl, ".parsnip_engines") <- .parsnip_eng
  attr(mod_pred_tbl, ".parsnip_functions") <- .parsnip_fns
  attr(mod_pred_tbl, ".split_type") <- .split_type
  attr(mod_pred_tbl, ".split_args") <- .split_args
  
  return(mod_pred_tbl)
}
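
The `.split_type` / `.split_args` pair is handed to an internal `create_splits()` helper whose body is not shown in this issue. As a rough sketch only, assuming the helper looks up the named `rsample` function and calls it with the supplied arguments, the dispatch could look like the following. `create_splits_sketch()` is a hypothetical name used for illustration, not the package's implementation.

```r
library(rsample)

# Hypothetical stand-in for the internal create_splits() helper; the real
# implementation is not shown in this issue and may differ.
create_splits_sketch <- function(.data, .split_type = "initial_split",
                                 .split_args = NULL) {
  # Look up the rsample function by name, e.g. rsample::initial_split
  split_fn <- getExportedValue("rsample", .split_type)
  # A NULL .split_args adds nothing, so the rsample defaults apply
  do.call(split_fn, c(list(data = .data), .split_args))
}

set.seed(123)
splits <- create_splits_sketch(mtcars)
nrow(training(splits)) # 24 of 32 rows with the default prop = 3/4

# Extra arguments are passed as a named list
splits_80 <- create_splits_sketch(mtcars, .split_args = list(prop = 0.8))
```

Passing the arguments as a named list keeps the outer function signature small while still exposing every option of the chosen split function.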

Example:

> library(recipes, quietly = TRUE)
> library(dplyr, quietly = TRUE)
> 
> df <- mtcars %>% mutate(cyl = as.factor(cyl))
> rec_obj <- recipe(cyl ~ ., data = df)
>   
> fct_tbl <- fast_classification(
+   .data = df, 
+   .rec_obj = rec_obj, 
+   .parsnip_eng = c("glm","LiblineaR"))
Warning message:
Problem while computing `fitted_wflw = internal_make_fitted_wflw(mod_tbl, splits_obj)`.
glm.fit: fitted probabilities numerically 0 or 1 occurred 
>   
> glimpse(fct_tbl)
Rows: 3
Columns: 8
$ .model_id       <int> 1, 2, 3
$ .parsnip_engine <chr> "glm", "LiblineaR", "LiblineaR"
$ .parsnip_mode   <chr> "classification", "classification", "classification"
$ .parsnip_fns    <chr> "logistic_reg", "logistic_reg", "svm_linear"
$ model_spec      <list> [~NULL, ~NULL, NULL, classification, TRUE, NULL, glm, TRUE], [~N…
$ wflw            <list> [mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, cyl, double,…
$ fitted_wflw     <list> [mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, cyl, double,…
$ pred_wflw       <list> [<tbl_df[24 x 1]>], [<tbl_df[24 x 1]>], [<tbl_df[24 x 1]>]
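
The output above stores one tibble of predictions per model in the `pred_wflw` list-column. A minimal sketch of how such a column can be inspected is below. It uses a mock tibble with made-up values rather than real model output, and the `.pred_class` column name is an assumption based on the usual parsnip convention, since the issue only shows the `24 x 1` shape.

```r
library(dplyr)
library(tidyr)

# Mock of the returned structure: illustrative values, not real predictions.
# The .pred_class column name is assumed, not confirmed by the issue.
mock_tbl <- tibble(
  .model_id = 1:2,
  .parsnip_engine = c("glm", "LiblineaR"),
  pred_wflw = list(
    tibble(.pred_class = factor(c("4", "6"), levels = c("4", "6", "8"))),
    tibble(.pred_class = factor(c("8", "8"), levels = c("4", "6", "8")))
  )
)

# Predictions for a single model
mock_tbl$pred_wflw[[1]]

# All predictions stacked, one row per prediction, keyed by model
preds_long <- mock_tbl %>%
  select(.model_id, .parsnip_engine, pred_wflw) %>%
  unnest(pred_wflw)
```

The same pattern should apply to `fct_tbl`, provided each element of `pred_wflw` is a tibble as the `glimpse()` output suggests.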
@spsanderson spsanderson added the enhancement New feature or request label Dec 30, 2022
@spsanderson spsanderson added this to the tidyAML 0.0.1 milestone Dec 30, 2022
@spsanderson spsanderson self-assigned this Dec 30, 2022
spsanderson added a commit that referenced this issue Dec 30, 2022