Rborist Error from doTryCatch() #55
Comments
I have the same problem. Send me an email and I will send you an example. |
Hello,
I am sending you my code, which uses 5-fold cross-validation. I am also
sending the data files used in the experiments (diabetes) and the R code
(with some comments). The diabetes files must be placed in the directory
called "DIRECTORY" for the experiments to run.
I also tried ntree = 100, thinLeaves=TRUE and autoCompress =
1.0, and I always get the same error:
Error in doTryCatch(return(expr), name, parentenv, handler) :
Training, prediction data types do not match
In 2017 I used this method with the same code without any problem.
Best regards
On Wed, Nov 9, 2022 at 0:39, suiji ***@***.***>) wrote:
… Github does not appear to offer a way to reach you directly. Please send
your example to ***@***.***
Thank you for your help. It's very difficult to convince people to report
bugs.
|
Will be happy to run your test code. Please feel free to send it when you are ready. In the meantime, though, the error message you encountered is complaining about a mismatch between the data frames employed for training and prediction. The package's "deframer" phase repacks data frames into distinct blocks of values having the same data type (numeric or factor, for example). Right now, the deframer expects the predictors to appear in the same order, and have the same data type, in both frames. We are loosening this requirement by means of a "keyed" option, which will match predictors in the two frames in arbitrary order by keying off their names. This option did not make it into 0.3-2, which had to be posted on CRAN under deadline. We do intend to support "keyed" in the next release. Could this be the source of your problem? Regards, |
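Until a "keyed" option is available, the prediction frame can be aligned to the training frame by hand. The following is a minimal base-R sketch of that idea, not the package's own mechanism; the frames and column names are hypothetical toy data.

```r
# Hypothetical toy frames; column names are illustrative only.
train <- data.frame(age = c(50, 31, 42),
                    sex = factor(c("f", "m", "f")))
test  <- data.frame(sex = factor(c("m", "f"), levels = c("f", "m")),
                    age = c(29, 61))

# Reorder the prediction frame so its columns follow the training order.
# drop = FALSE keeps the result a data frame even with a single column.
testAligned <- test[, names(train), drop = FALSE]

# Positional checks: same names and same classes at each position.
sameOrder <- identical(names(train), names(testAligned))
sameTypes <- identical(sapply(train, class), sapply(testAligned, class))
```

If both checks are TRUE, the two frames agree in column order and data type, which is what the positional scheme described above expects.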
No example has been received so far, but we're ready to help when it arrives. Please note that setting autoCompress to 1.0 was a workaround for a problem appearing in version 0.2-4 and should no longer be relevant. Setting thinLeaves serves strictly to reduce the memory footprint, so it is also unlikely to apply. |
I sent it to you last week. I probably used the wrong email address. Sorry.
I use 5-fold cross-validation.
|
Thank you. Your example reproduces the behavior you describe.
The error message is complaining that the training and prediction data frames do not match and, in this example, they do not. The training frame contains two predictor columns, while the prediction frame contains three. The example can easily be made to work by filtering out the third column ("Y") for prediction.
Traditionally, the package has offered only a positional scheme for reconciling data frames between, say, training and prediction. That is, the columns in the training frame were assumed to match those in the prediction frame. Some checking was performed to ensure, at the very least, that data types agree at their respective positions across the two frames. With release 0.3 we are also checking that the two frames have the same number of predictors. If your example does not fail with earlier releases, it is likely because we were not performing the additional check.
In addition to the positional scheme, we are planning to introduce a "keyed" (or maybe "keyedFrame") option which will allow the column positions to vary between the two frames. In particular, there would be no problem with the training frame having fewer columns than the prediction frame, so long as the latter includes all columns present in the training frame and the respective types agree. |
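For anyone who wants to diagnose this kind of mismatch before calling predict(), the checks described above can be approximated in base R. This is only a sketch: checkFrames is a hypothetical helper, not part of Rborist, and the frames are made-up toy data shaped like the example (two training predictors, with an extra "Y" column in the prediction frame).

```r
# Hypothetical helper, not part of Rborist: lists positional mismatches
# between a training frame and a prediction frame.
checkFrames <- function(train, test) {
  problems <- character(0)
  if (ncol(train) != ncol(test)) {
    problems <- c(problems,
                  sprintf("column count differs: %d vs %d",
                          ncol(train), ncol(test)))
  }
  # Compare the first class of each column over the positions both frames share.
  shared <- seq_len(min(ncol(train), ncol(test)))
  trainTypes <- vapply(train[shared], function(col) class(col)[1], character(1))
  testTypes  <- vapply(test[shared],  function(col) class(col)[1], character(1))
  bad <- shared[trainTypes != testTypes]
  if (length(bad) > 0) {
    problems <- c(problems,
                  sprintf("type differs at position %d: %s vs %s",
                          bad, trainTypes[bad], testTypes[bad]))
  }
  problems
}

# Toy data: the prediction frame carries an extra response column "Y".
train <- data.frame(x1 = c(1.5, 2.0, 3.1), x2 = c(0.2, 0.4, 0.6))
test  <- data.frame(x1 = c(1.1, 2.2), x2 = c(0.3, 0.5), Y = c(10, 12))

before <- checkFrames(train, test)
after  <- checkFrames(train, test[, names(train), drop = FALSE])
```

Here `before` reports the column-count difference, while `after` is empty once the prediction frame is restricted to the training columns.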
I am not an expert in R. How can I filter out the Y variable in my code,
as you suggest, to fix the error? Thank you.
|
The easiest way to filter out a column is probably just to place a minus sign in front of its column index. Moreover, Rborist's predict() method computes MSE as a side effect when passed a test vector. So you can probably save some work by applying the following codelet, which omits column 3 from the new data but passes it as a test vector: yPrime <- predict(fitMulti, test[,-3], test[,3]) |
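For readers less familiar with R indexing, the column selection in that codelet can be sketched on its own in base R. The frame below is hypothetical toy data with the response "Y" in column 3; the model object fitMulti comes from the thread and is not created here, so the predict() call appears only as a comment.

```r
# Hypothetical prediction frame: two predictors plus the response "Y" in column 3.
test <- data.frame(x1 = c(1.1, 2.2, 3.3),
                   x2 = c(0.3, 0.5, 0.7),
                   Y  = c(10, 12, 15))

# Drop the response by position. A negative index excludes that column;
# drop = FALSE keeps a data frame even if only one column remains.
xTestByIndex <- test[, -3, drop = FALSE]

# Drop the response by name, which still works if the column order changes.
xTestByName <- test[, setdiff(names(test), "Y"), drop = FALSE]

# Keep the response separately as a plain numeric vector.
yTest <- test[, 3]

# With a trained model (called fitMulti in this thread), the call would then read:
# yPrime <- predict(fitMulti, xTestByIndex, yTest)
```

Selecting by name is slightly more robust than selecting by position, since it does not assume that "Y" is always the third column.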
Closing this thread. Please feel free to reopen or begin a new thread. |
GitHub reports steady search activity for this and similar trapped-error messages. None of the tests we have on hand report premature exit. If someone has a reproducible test case, however, please help out by responding to this Issue or opening a new bug report.
Thank you.