Error when using knnImpute pre-processing option for dataset containing factor #404
Comments
You aren't missing anything; it is a bug and you are correct about the issue. I checked in code to fix the issue. Thanks, Max
Hello,
here is a reproducible example:
This does not produce errors when using "medianImpute". When I remove the only factor from the data, it works as expected:
My session info is attached below. I already dug through the code for a bit, and it seems that the `old` data used by the `nnimp` function (stored in `pp$data`) does not contain the factor columns. The function then tries to select all non-NA columns of the new data from the old data (`old[, non_missing_cols, drop = FALSE]`), which fails because the factor columns are not NA but are also not present in the old data.

Is this a bug, or am I missing something when it comes to using the "knnImpute" option?
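The failure mode described above can be sketched in base R without caret. This is only an illustration under stated assumptions: the data frames `train` and `new_row` are hypothetical stand-ins rather than the data from the report, and `old` is built by hand to mimic stored data that keeps only numeric columns. The `intersect()` guard at the end is one possible way to avoid the error, not necessarily the fix that was checked in.

```r
# Hypothetical training data with one factor column (not from the original report).
train <- data.frame(
  a = c(1, 2, 3),
  b = c(4, 5, 6),
  f = factor(c("x", "y", "x"))
)

# Hypothetical new observation with a missing numeric value.
new_row <- data.frame(a = NA_real_, b = 5, f = factor("x", levels = c("x", "y")))

# Assumption: the stored data keeps only the numeric columns,
# mirroring the description of pp$data in the report.
old <- train[, vapply(train, is.numeric, logical(1)), drop = FALSE]

# Columns of the new observation that are not NA; this includes the factor "f".
non_missing_cols <- names(new_row)[!vapply(new_row, anyNA, logical(1))]

# Selecting those columns from `old` fails because "f" is not a column of `old`.
result <- tryCatch(
  old[, non_missing_cols, drop = FALSE],
  error = function(e) e
)

# One possible guard (an assumption, not the package's actual fix):
# only select columns that exist in the stored data.
safe_cols <- intersect(non_missing_cols, names(old))
guarded <- old[, safe_cols, drop = FALSE]
```

Here `result` holds the error condition raised by the subsetting call, while `guarded` contains only the numeric column shared by both data sets.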