Q: How to preserve dataframe columns alongside predictions? #194
Comments
Hi @doctapp, in H2O we guarantee the order of rows, so if you have a frame with rows [A, B, C], the prediction frame will follow the same order [PA, PB, PC].
If you have an H2OFrame DATA and a prediction frame P, you can write:
val dataFrame: H2OFrame = ...
val predFrame: H2OFrame = ...
val dataAndPredFrame = dataFrame.add(predFrame)
Note: The result …
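To make the row-order point concrete, here is a minimal, dependency-free sketch in plain Scala. It does not use H2O at all: `ToyFrame`, `ToyFrameExample` and the `predict` helper are hypothetical stand-ins invented for illustration, assuming only what is stated above (predictions come back in the same row order as the input, so columns can be appended by position without a join key).

```scala
// Toy stand-in for a frame: named columns of equal length.
// Rows are identified by position only (no key column is needed).
final case class ToyFrame(columns: Vector[(String, Vector[String])]) {
  require(columns.map(_._2.length).distinct.length <= 1,
    "all columns must have the same number of rows")

  def numRows: Int = columns.headOption.map(_._2.length).getOrElse(0)
  def names: Vector[String] = columns.map(_._1)

  // Positional column append: row i of `other` lines up with row i of this frame.
  def add(other: ToyFrame): ToyFrame = {
    require(columns.isEmpty || other.columns.isEmpty || numRows == other.numRows,
      "row counts must match")
    ToyFrame(columns ++ other.columns)
  }

  // All values of row i, in column order.
  def row(i: Int): Vector[String] = columns.map(_._2(i))
}

object ToyFrameExample {
  val data: ToyFrame = ToyFrame(Vector("id" -> Vector("A", "B", "C")))

  // Hypothetical scorer: one prediction per input row, in input order.
  def predict(frame: ToyFrame): ToyFrame =
    ToyFrame(Vector("predict" -> frame.columns.head._2.map(v => "P" + v)))

  // Original columns followed by the prediction column.
  val dataAndPred: ToyFrame = data.add(predict(data))
}
```

The toy `add` returns a new value. Whether the real `Frame.add` returns a copy or modifies the receiver is worth checking against the H2O documentation before reusing `dataFrame` afterwards.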
Thanks for the tip, but that's internal to the ML bindings... I guess that would require hacking H2OModel.transform, right?
@doctapp are you using the Scala API? In that case it is just another expression after the model predict call:
val predFrame = gbmModel.predict(dataFrame)
val dataAndPredFrame = dataFrame.add(predFrame)
If you are using the R/Python interface, you need to use cbind to join both tables together.
Worked! What I did was change H2OModel.scala to make it consistent with other ML packages:
override def transform(dataset: Dataset[_]): DataFrame = {
  val frame: H2OFrame = h2oContext.asH2OFrame(dataset.toDF())
  val prediction = model.score(frame)
  // Preserve original dataset by appending predictions
  h2oContext.asDataFrame(frame.add(prediction))(sqlContext)
    .withColumnRenamed("predict", "prediction")
}
Thanks
👍 nice!
We can do that as an optional parameter for the ML package...
I've trained a GBM model using the ML bindings. The problem is there's only a "predict" column when predicting. How can we preserve the original dataframe columns? I don't have any context to join back the predictions (I shouldn't need to join btw).
Thanks