SPARK-8078 #6611

yang1young · 2015-06-03T13:37:29Z

In Spark MLlib, decision trees support Gini impurity, entropy, and variance as impurity measures. The entropy impurity is implemented by calculating information gain, which was put forward by J. Ross Quinlan in the ID3 algorithm. This can be improved by implementing the C4.5 algorithm, which uses information gain ratio instead of information gain to calculate impurity. With C4.5, the decision tree model can achieve higher prediction accuracy in most cases.
https://issues.apache.org/jira/browse/SPARK-8078
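
For readers unfamiliar with the distinction drawn above, here is a minimal sketch of information gain (ID3) versus information gain ratio (C4.5) for a single candidate split. It is standalone Python for illustration only, not the code in this patch and not MLlib's API; the function names `entropy`, `info_gain`, `split_info`, and `gain_ratio` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(parent, partitions):
    """ID3 criterion: parent entropy minus size-weighted child entropy."""
    n = len(parent)
    weighted = sum(len(p) / n * entropy(p) for p in partitions)
    return entropy(parent) - weighted


def split_info(parent, partitions):
    """Entropy of the partition sizes themselves (C4.5's normalizer)."""
    n = len(parent)
    return -sum((len(p) / n) * log2(len(p) / n) for p in partitions if p)


def gain_ratio(parent, partitions):
    """C4.5 criterion: information gain divided by split information."""
    si = split_info(parent, partitions)
    # A split that puts everything in one partition has zero split info;
    # treat it as uninformative (an assumption of this sketch).
    return info_gain(parent, partitions) / si if si > 0 else 0.0


labels = [1, 1, 0, 0]
two_way = [[1, 1], [0, 0]]         # binary split
four_way = [[1], [1], [0], [0]]    # one partition per row

# Both splits have the same information gain, but the gain ratio
# penalizes the split with more branches.
print(info_gain(labels, two_way), gain_ratio(labels, two_way))
print(info_gain(labels, four_way), gain_ratio(labels, four_way))
```

In this toy case both splits separate the classes perfectly, so information gain cannot tell them apart, while the gain ratio prefers the two-way split because its split information is smaller.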

AmplabJenkins · 2015-06-03T13:42:12Z

Can one of the admins verify this patch?

srowen · 2015-06-03T14:34:51Z

mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala

None of these lines match the project code style. Please read https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

sryza · 2015-06-03T19:00:26Z

Mind giving this a more descriptive title that includes [MLLIB]?

srowen · 2015-06-04T07:21:03Z

OK, if you're closing this JIRA, do you mind closing this PR?

change_1

f869875

srowen reviewed Jun 3, 2015

yang1young closed this Jun 4, 2015
