Closed
Labels
bug: an unexpected problem or unintended behavior
Description
Ranger models fit without any problems when using a sparse representation, but generating predictions produces an error about coercing a matrix to a data frame. I guess this relates to when the predictions are being converted back from ranger's output? Together with issue 691, I think this means glmnet is the only engine that supports sparse matrices.
library(tidyverse)
library(tidymodels)
#> Warning: package 'dials' was built under R version 4.1.2
#> Warning: package 'recipes' was built under R version 4.1.2
#> Warning: package 'workflowsets' was built under R version 4.1.2
data("small_fine_foods")
training_data
#> # A tibble: 4,000 × 3
#> product review score
#> <chr> <chr> <fct>
#> 1 B000J0LSBG "this stuff is not stuffing its not good at all save yo… other
#> 2 B000EYLDYE "I absolutely LOVE this dried fruit. LOVE IT. Whenever I … great
#> 3 B0026LIO9A "GREAT DEAL, CONVENIENT TOO. Much cheaper than WalMart and… great
#> 4 B00473P8SK "Great flavor, we go through a ton of this sauce! I discove… great
#> 5 B001SAWTNM "This is excellent salsa/hot sauce, but you can get it for … great
#> 6 B000FAG90U "Again, this is the best dogfood out there. One suggestion… great
#> 7 B006BXTCEK "The box I received was filled with teas, hot chocolates, a… other
#> 8 B002GWH5OY "This is delicious coffee which compares favorably with muc… great
#> 9 B003R0MFYY "Don't let these little tiny cans fool you. They pack a lo… great
#> 10 B001EO5ZXI "One of the nicest, smoothest cup of chai I've made. Nice m… great
#> # … with 3,990 more rows
library(hardhat)
#> Warning: package 'hardhat' was built under R version 4.1.2
sparse_bp <- default_recipe_blueprint(composition = "dgCMatrix")
library(textrecipes)
text_rec <-
  recipe(score ~ review, data = training_data) %>%
  step_tokenize(review) %>%
  step_tokenfilter(review, max_tokens = 1e3) %>%
  step_tfidf(review)
ranger_spec <-
  rand_forest(mode = "classification") %>%
  set_engine("ranger")
wf_sparse <-
  workflow() %>%
  add_recipe(text_rec, blueprint = sparse_bp) %>%
  add_model(ranger_spec)
wf_default <-
  workflow() %>%
  add_recipe(text_rec) %>%
  add_model(ranger_spec)
set.seed(123)
wf_default %>%
  fit(training_data) %>%
  predict(testing_data)
#> # A tibble: 1,000 × 1
#> .pred_class
#> <fct>
#> 1 great
#> 2 great
#> 3 other
#> 4 great
#> 5 great
#> 6 great
#> 7 great
#> 8 great
#> 9 great
#> 10 other
#> # … with 990 more rows
wf_sparse %>%
  fit(training_data) %>%
  predict(testing_data)
#> Error in as.data.frame.default(new_data): cannot coerce class 'structure("dgCMatrix", package = "Matrix")' to a data.frame

Created on 2022-03-26 by the reprex package (v2.0.1)
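For what it's worth, the coercion named in the error message can be looked at outside of tidymodels. Below is a minimal sketch, assuming only the Matrix package; the matrix values and the `tfidf_*` column names are made up for illustration. It shows that a `dgCMatrix` can be turned into a data frame by going through a dense matrix first. This is not a claim about where the conversion happens in the prediction code or how it should be fixed there, and densifying gives up the memory savings that motivated the sparse blueprint in the first place.

```r
library(Matrix)

# A small sparse matrix standing in for the tf-idf predictors
# (hypothetical values and column names, not taken from the reprex)
x_sparse <- sparseMatrix(
  i = c(1, 2, 3),
  j = c(1, 2, 3),
  x = c(0.5, 1.2, 0.7),
  dims = c(3, 3),
  dimnames = list(NULL, c("tfidf_a", "tfidf_b", "tfidf_c"))
)

# Same class that the sparse blueprint hands to the model
class(x_sparse)

# Converting to a dense matrix first gives as.data.frame() something
# it knows how to handle, at the cost of materialising every zero
x_df <- as.data.frame(as.matrix(x_sparse))
str(x_df)
```

With 4,000 rows and 1,000 tokens the dense version would still fit in memory here, so this could do as a stopgap when checking predictions by hand, but it would not scale to larger vocabularies.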