make sure all encodings are done correctly #11

EmilHvitfeldt · 2020-08-23T16:30:51Z

encodings -> parsnip encodings, set by parsnip::set_encoding()

EmilHvitfeldt · 2020-10-25T22:56:11Z

Wait for #5, #6, #7, #8

hfrick · 2024-01-10T14:45:48Z

All engines have a formula interface (or at least that is what we tell parsnip in the case of glmnet), and most have the unsurprising encodings of predictor_indicators = "none", compute_intercept = FALSE, and remove_intercept = FALSE. This leaves the indicators and the intercept to the engine, which is sensible for an engine with a formula interface.
The exceptions are bag_tree(engine = "rpart") and glmnet. The glmnet encodings for predictors and intercept are correct; the ones for the bagged tree should switch to predictor_indicators = "none".
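For readers unfamiliar with how these options are registered, here is a minimal sketch of the "unsurprising" encoding described above. To avoid clashing with the encodings censored already registers, it uses a made-up model and engine: `demo_surv_model` and `demo_engine` are hypothetical names for illustration only and are not part of censored or parsnip.

```r
library(parsnip)

# Register a hypothetical model, mode, and engine (illustration only)
set_new_model("demo_surv_model")
set_model_mode("demo_surv_model", mode = "censored regression")
set_model_engine("demo_surv_model", mode = "censored regression", eng = "demo_engine")

# Leave indicators and intercept to the engine, as one would for an
# engine with a formula interface, and do not allow sparse predictors
set_encoding(
  model = "demo_surv_model",
  eng = "demo_engine",
  mode = "censored regression",
  options = list(
    predictor_indicators = "none",
    compute_intercept = FALSE,
    remove_intercept = FALSE,
    allow_sparse_x = FALSE
  )
)

# Inspect what was registered
get_encoding("demo_surv_model")
```

For the bagged tree, the change proposed above would amount to using predictor_indicators = "none" in the corresponding set_encoding() call for bag_tree with the rpart engine.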

In terms of sparsity:

- Only glmnet allows sparse matrices (allow_sparse_x = TRUE). That is generally correct but might not hold for this case; see "glmnet and sparse matrices" (#276).
- The mboost package includes support for sparse matrices, but not in mboost::blackboost(), which is what we are using.
- The rest do not support sparse matrices, as far as I can tell.

library(censored)
#> Loading required package: parsnip
#> Loading required package: survival
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(purrr)

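# Names of all models registered in the parsnip model environment
mod_names <- get_from_env("models")

# Fit interface registered for each model/engine/mode combination
model_interface <-
  purrr::map_dfr(mod_names, ~ get_from_env(paste0(.x, "_fit")) %>%
                   mutate(model = .x)) %>% 
  mutate(interface = map_chr(value, 1)) %>% 
  select(engine, mode, model, interface)

# Encodings registered for each model/engine, restricted to censored regression
model_encodings <-
  purrr::map_dfr(mod_names, ~ get_from_env(paste0(.x, "_encoding"))) %>% 
  #left_join(model_interface, by = join_by(model, engine, mode)) %>% 
  filter(mode == "censored regression") 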

model_encodings %>% 
  #group_by(interface) %>%
  count(predictor_indicators, compute_intercept, remove_intercept, allow_sparse_x)
#> # A tibble: 3 × 5
#>   predictor_indicators compute_intercept remove_intercept allow_sparse_x     n
#>   <chr>                <lgl>             <lgl>            <lgl>          <int>
#> 1 none                 FALSE             FALSE            FALSE              9
#> 2 traditional          FALSE             FALSE            FALSE              1
#> 3 traditional          TRUE              TRUE             TRUE               1

model_encodings %>% 
  filter(predictor_indicators == "traditional")
#> # A tibble: 2 × 7
#>   model     engine mode  predictor_indicators compute_intercept remove_intercept
#>   <chr>     <chr>  <chr> <chr>                <lgl>             <lgl>           
#> 1 bag_tree  rpart  cens… traditional          FALSE             FALSE           
#> 2 proporti… glmnet cens… traditional          TRUE              TRUE            
#> # ℹ 1 more variable: allow_sparse_x <lgl>

model_encodings %>% 
  filter(allow_sparse_x)
#> # A tibble: 1 × 7
#>   model     engine mode  predictor_indicators compute_intercept remove_intercept
#>   <chr>     <chr>  <chr> <chr>                <lgl>             <lgl>           
#> 1 proporti… glmnet cens… traditional          TRUE              TRUE            
#> # ℹ 1 more variable: allow_sparse_x <lgl>

Created on 2024-01-10 with reprex v2.0.2

github-actions · 2024-01-25T00:59:02Z

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

hfrick added the "feature" (a feature request or enhancement) label on Apr 16, 2021

hfrick mentioned this issue Jan 10, 2024

Fix encodings #291

Merged

hfrick closed this as completed in #291 Jan 10, 2024

github-actions bot locked and limited conversation to collaborators Jan 25, 2024