[SPARK-6684] [mllib] [ml] Add checkpointing to GBTs #7804
@@ -144,6 +144,7 @@ final class EMLDAOptimizer extends LDAOptimizer {
this.checkpointInterval = lda.getCheckpointInterval
this.graphCheckpointer = new PeriodicGraphCheckpointer[TopicCounts, TokenCount](
  checkpointInterval, graph.vertices.sparkContext)
this.graphCheckpointer.update(this.graph)
This should have been done in the previous PR.
@@ -269,6 +269,8 @@ object GradientBoostedTrees extends Logging {
logInfo("Internal timing for DecisionTree:")
logInfo(s"$timer")

predErrorCheckpointer.deleteAllCheckpoints()
Not in this PR, but what if we want to keep the last RDD checkpointed in the queue?
It should not be a problem for recoverability (unless the driver crashes), but that sounds useful for model stats. To do!
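To make the distinction in this thread concrete, here is a toy sketch in plain Scala (no Spark dependency) of a checkpoint queue with two cleanup strategies: dropping everything, as `deleteAllCheckpoints()` does in the diff above, versus keeping the most recent entry. Everything here is an illustrative assumption: `ToyPeriodicCheckpointer`, `deleteAllCheckpointsButLast`, and `liveCheckpoints` are hypothetical names, checkpoints are modeled as string labels, and the real checkpointer's pruning of older checkpoints during training is left out for simplicity.

```scala
import scala.collection.mutable

// Toy stand-in for a periodic checkpointer. "Checkpoints" are string labels and
// "deleting" one just removes it from the queue. This is NOT Spark's implementation.
final class ToyPeriodicCheckpointer(checkpointInterval: Int) {
  require(checkpointInterval >= 1, "checkpointInterval must be >= 1")

  private val checkpointQueue = mutable.Queue.empty[String]
  private var updateCount = 0

  /** Register a new dataset; checkpoint it on every checkpointInterval-th update. */
  def update(label: String): Unit = {
    updateCount += 1
    if (updateCount % checkpointInterval == 0) {
      checkpointQueue.enqueue(label)
    }
  }

  /** Drop every checkpoint (the behavior called at the end of training). */
  def deleteAllCheckpoints(): Unit = checkpointQueue.clear()

  /** Hypothetical variant: drop all checkpoints except the most recent one. */
  def deleteAllCheckpointsButLast(): Unit = {
    while (checkpointQueue.size > 1) {
      checkpointQueue.dequeue()
    }
  }

  /** Labels of the checkpoints that still exist, oldest first. */
  def liveCheckpoints: List[String] = checkpointQueue.toList
}

object ToyCheckpointDemo {
  def main(args: Array[String]): Unit = {
    val checkpointer = new ToyPeriodicCheckpointer(checkpointInterval = 2)
    (1 to 5).foreach(i => checkpointer.update(s"predError-iter$i"))
    println(checkpointer.liveCheckpoints) // List(predError-iter2, predError-iter4)

    checkpointer.deleteAllCheckpointsButLast()
    println(checkpointer.liveCheckpoints) // List(predError-iter4)

    checkpointer.deleteAllCheckpoints()
    println(checkpointer.liveCheckpoints) // List()
  }
}
```

With the keep-last variant, the final checkpoint would still be available after training, which is the property the "model stats" remark above is after.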
LGTM except one minor comment.
Test build #39097 has finished for PR 7804 at commit
Test build #39101 has finished for PR 7804 at commit
Test build #39111 has finished for PR 7804 at commit
Merging with master. Thanks for the review!
Add checkpointing to GradientBoostedTrees, GBTClassifier, GBTRegressor
CC: @mengxr