Error when training univariate recipes with sampling #875
Comments
A bug. Please install the current GitHub version to make sure that it solves your issue.
Thanks!
When using `train` on an object of class `recipe` that has only one predictor variable, along with a sampling method specified in `trainControl()`, an error is thrown:
Minimal dataset:
Minimal, runnable code:
It looks like this is due to lines in the `rec_model()` function.
Error when `sampling$first == TRUE`
For cases where sub-sampling occurs before preprocessing, `sampling$first == TRUE`, the error occurs here:
If there is only one value for `other_cols` (i.e. one non-outcome column), the result `other_dat` will be a vector rather than a `data.frame` or `tbl`. When this occurs, `sampling$func(other_dat, y)` assigns an arbitrary name to the column in the `tmp$x` data frame, which is also called `x`. When the frame `dat` is later passed to `prep()`, the predictor name in `dat`, now set to `x`, no longer matches the predictor name in `rec`, which is set to `a` in this example. This causes `prep()` to throw an error, because it cannot find the predictor column.
Error when `sampling$first == FALSE`
For cases where sub-sampling occurs after preprocessing, `sampling$first == FALSE`, the error occurs here:
`x` is a `tbl`, and when it has only one column, `sampling$func` treats it like a vector, causing the function to throw an error (I have tested for `up`, `rose`, and `smote`).
Example without error
Because of the `if` statement at the top of `rec_model` that checks for a sampling method, this problem only occurs if a sampling method is specified.
Session Info:
Possible fix
I believe the way to fix it would be to change `other_dat <- dat[, other_cols]` to `other_dat <- dat[, other_cols, drop = FALSE]` when `sampling$first == TRUE`, and to add `x <- as.data.frame(x)` prior to `tmp <- sampling$func(x, y)` when `sampling$first == FALSE`. I have tested these changes and they were successful. I can submit a pull request with these changes.
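For illustration, the `sampling$first == TRUE` part of this can be reproduced in base R without caret. This is only a minimal sketch: the toy data and `toy_sampler()` are hypothetical stand-ins made up for this example (they are not caret's data or its `sampling$func` implementation), and the sketch assumes only that the sampler wraps a bare vector in a data frame with a column named `x`, as described above. It does not cover the `tbl` case for `sampling$first == FALSE`.

```r
# Toy data (made up): one predictor `a` and a two-class outcome `y`.
dat <- data.frame(
  a = c(1.2, 3.4, 2.2, 5.1, 4.8, 0.7),
  y = factor(c("yes", "no", "no", "no", "no", "yes"))
)
other_cols <- "a"

# Hypothetical stand-in for sampling$func: up-samples every class to the size
# of the largest one and returns list(x = <predictors>, y = <outcome>).
toy_sampler <- function(x, y) {
  # A bare vector carries no column name, so it ends up in a column called "x".
  if (!is.data.frame(x)) x <- data.frame(x = x)
  n_max <- max(table(y))
  idx <- unlist(lapply(levels(y), function(lvl) {
    rows <- which(y == lvl)
    extra <- sample.int(length(rows), n_max - length(rows), replace = TRUE)
    c(rows, rows[extra])
  }))
  list(x = x[idx, , drop = FALSE], y = y[idx])
}

# Current behaviour: single-column subsetting drops to a bare numeric vector.
other_dat_vec <- dat[, other_cols]
is.data.frame(other_dat_vec)                # FALSE
names(toy_sampler(other_dat_vec, dat$y)$x)  # "x": the name `a` is lost

# Proposed change: drop = FALSE keeps a one-column data frame and its name.
other_dat_df <- dat[, other_cols, drop = FALSE]
names(toy_sampler(other_dat_df, dat$y)$x)   # "a"
```

With `drop = FALSE` the predictor keeps the name `a`, so a later lookup by that name (as `prep()` would do against `rec`) can still find it.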