consistently avoid dense matrix conversion for glmnet(x = ...) #1315
(This is a more complete version of the fix submitted for #1096.)
Currently, some of the functions for the `glmnet` model check whether the training data is a `sparseMatrix`, and some don't. As a result, initial operations in `train()` might succeed, and then a later step in the workflow fails (usually with "Cholmod error 'problem too large'" for a `sparseMatrix` with very large dimensions) because some of the training data is inadvertently converted to an (impossibly large) dense matrix.

For instance, this bug currently occurs whenever `prob()` in `glmnet.R` is called (which happens if `trainControl(classProbs = TRUE)` is set), or if `tuneLength` is used instead of `tuneGrid` for `train()`, because `tuneLength = ...` triggers a call to `grid()` in `glmnet.R`, which does not check for a `sparseMatrix` before executing `Matrix::as.matrix()`.
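A minimal sketch of the guard pattern involved, assuming the fix consists of checking the class before densifying (the helper name `densify_if_needed` is hypothetical; the actual change edits the model functions in `glmnet.R` directly). Since `glmnet` accepts sparse input natively, the conversion should only happen when the data is not already a `sparseMatrix`:

```r
library(Matrix)

# Hypothetical helper illustrating the guard: leave sparse input as-is,
# densify everything else before handing it to glmnet().
densify_if_needed <- function(x) {
  if (inherits(x, "sparseMatrix")) {
    x                      # glmnet handles sparse (dgCMatrix) input directly
  } else {
    Matrix::as.matrix(x)   # only densify data that was never sparse
  }
}
```

With a guard like this applied consistently across the model's `fit`, `predict`, `prob`, and `grid` functions, a very large `sparseMatrix` passed to `train()` would never be coerced to a dense matrix, avoiding the Cholmod size error.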