@smurching
Contributor

What changes were proposed in this pull request?

Added local training of decision tree regressors, based on Yggdrasil.

Some classes/objects largely correspond to Yggdrasil classes/objects.
Specifically:

  • class LocalDecisionTreeRegressor --> class YggdrasilRegressor
  • object LocalDecisionTree --> object YggdrasilRegression
  • object LocalDecisionTreeUtils --> object Yggdrasil

How was this patch tested?

Added unit tests in ml/tree/impl/LocalTreeTrainingSuite.scala verifying that local and distributed training of a decision tree regressor produce the same tree.
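
For readers unfamiliar with this kind of check, here is a minimal sketch of what "produce the same tree" could mean structurally. It is not the code from LocalTreeTrainingSuite.scala: the `TreeNode`, `Leaf`, and `Internal` types and the `sameTree` helper are hypothetical stand-ins, not Spark's node classes, and the tolerance `eps` is an assumed parameter.

```scala
// Hypothetical, simplified tree types for illustration only.
// These are NOT Spark's node classes.
sealed trait TreeNode
final case class Leaf(prediction: Double) extends TreeNode
final case class Internal(
    featureIndex: Int,
    threshold: Double,
    left: TreeNode,
    right: TreeNode) extends TreeNode

object TreeEquality {
  /**
   * Returns true when both trees have the same shape, split on the same
   * features, and agree on thresholds and leaf predictions within `eps`.
   */
  def sameTree(a: TreeNode, b: TreeNode, eps: Double = 1e-9): Boolean =
    (a, b) match {
      case (Leaf(p1), Leaf(p2)) =>
        math.abs(p1 - p2) <= eps
      case (Internal(f1, t1, l1, r1), Internal(f2, t2, l2, r2)) =>
        f1 == f2 &&
          math.abs(t1 - t2) <= eps &&
          sameTree(l1, l2, eps) &&
          sameTree(r1, r2, eps)
      case _ =>
        // One side is a leaf and the other an internal node.
        false
    }
}
```

A test in this style would train one tree locally and one with the distributed code path on the same data, then assert `TreeEquality.sameTree(localRoot, distributedRoot)`, where `localRoot` and `distributedRoot` are again hypothetical names.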

@smurching smurching changed the title Add local tree training for decision tree regressors [SPARK-3162] [MLlib] Add local tree training for decision tree regressors Aug 30, 2016
@smurching smurching changed the title [SPARK-3162] [MLlib] Add local tree training for decision tree regressors [SPARK-3162][MLlib] Add local tree training for decision tree regressors Aug 30, 2016
@smurching smurching changed the title [SPARK-3162][MLlib] Add local tree training for decision tree regressors [SPARK-3162][MLlib][WIP] Add local tree training for decision tree regressors Aug 30, 2016
@jkbradley
Member

ok to test

@SparkQA

SparkQA commented Aug 30, 2016

Test build #64624 has finished for PR 14872 at commit aa4fcc8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


// Update current model and node periphery.
// Note: This flatMap has side effects (on the model).
activeNodePeriphery = LocalDecisionTreeUtils.computeActiveNodePeriphery(activeNodePeriphery,
Member

Just get the children from computeBestSplits.

@SparkQA

SparkQA commented Sep 2, 2016

Test build #64879 has finished for PR 14872 at commit 8d443ce.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@smurching smurching closed this Sep 2, 2016
@smurching smurching reopened this Sep 27, 2016
@jkbradley
Member

ok to test

@SparkQA

SparkQA commented Sep 28, 2016

Test build #66067 has finished for PR 14872 at commit ee56ffe.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 3, 2016

Test build #66249 has finished for PR 14872 at commit 6e3be3a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Can one of the admins verify this patch?

@tomlaube

Hi, is there any progress on this?

@smurching
Contributor Author

Hi,

I've stopped working on this PR - I can go ahead and close it.

@smurching smurching closed this Jan 24, 2017
@jkbradley
Member

@smurching Sorry we haven't had time to continue with this. Please don't delete the branch; I'd like to pick it up eventually!

@smurching
Contributor Author

No worries, apologies for being busy on my end -- I'll leave the branch up & try to contribute in other ways when I have the time!

@hollinwilkins

Hey, would definitely like to see this included in Spark. We are having issues with OOM errors training pretty small RFs. It could have something to do with the cross validation implementation as well, as it stores all of the models in memory at once.

7 participants