@jkbradley
Member

This modifies DecisionTreeMetadata construction to treat 1-category features as continuous, so that trees do not fail with such features. It is important for the pipelines API, where VectorIndexer can automatically categorize certain features as categorical.
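
To illustrate the idea (this is a minimal sketch, not the code in this patch), the rule can be written as a small standalone function. The name `split_feature_kinds` and the plain-dict representation of the categorical feature info are assumptions made for the example; the only behavior taken from the description above is that a categorical feature with a single category is handled as continuous.

```python
from typing import Dict, Set, Tuple


def split_feature_kinds(
    num_features: int, categorical_arity: Dict[int, int]
) -> Tuple[Dict[int, int], Set[int]]:
    """Split feature indices into categorical and continuous groups.

    categorical_arity maps a feature index to its declared number of
    categories. A feature declared with only 1 category is not kept as
    categorical; it falls through to the continuous group instead.
    """
    categorical = {}
    for index, arity in categorical_arity.items():
        if arity > 1:  # only features with 2+ categories stay categorical
            categorical[index] = arity
    # Everything not kept as categorical is treated as continuous.
    continuous = {i for i in range(num_features) if i not in categorical}
    return categorical, continuous


# Feature 0 has 3 categories; feature 2 was declared with just 1.
categorical, continuous = split_feature_kinds(4, {0: 3, 2: 1})
print(categorical)         # {0: 3}
print(sorted(continuous))  # [1, 2, 3]
```

In this sketch feature 2 ends up in the continuous group, so nothing downstream has to handle a categorical feature with a single category.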

As stated in the JIRA, this is a temporary fix that we can improve on later by automatically filtering out those features. That will take longer, though, since it will require careful indexing.
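
One way to read "careful indexing" (an assumption, not something spelled out in this thread) is that dropping features shifts the indices of the remaining ones, so any feature index a tree reports has to be mapped back to the original feature. The sketch below uses hypothetical names (`drop_single_category_features`, `kept`) and plain Python structures to show that bookkeeping.

```python
from typing import Dict, List, Tuple


def drop_single_category_features(
    num_features: int, categorical_arity: Dict[int, int]
) -> Tuple[List[int], Dict[int, int]]:
    """Drop 1-category features and re-index the rest.

    Returns (kept, new_arity) where kept[new_index] is the original
    feature index, and new_arity is the categorical info expressed in
    the new, compacted index space.
    """
    # Continuous features are absent from categorical_arity, so .get()
    # returns None for them and they are kept.
    kept = [i for i in range(num_features) if categorical_arity.get(i) != 1]
    new_arity = {
        new: categorical_arity[old]
        for new, old in enumerate(kept)
        if old in categorical_arity
    }
    return kept, new_arity


def project(row: List[float], kept: List[int]) -> List[float]:
    """Keep only the retained features of one feature vector."""
    return [row[i] for i in kept]


kept, new_arity = drop_single_category_features(4, {0: 3, 2: 1, 3: 2})
print(kept)                                  # [0, 1, 3]
print(new_arity)                             # {0: 3, 2: 2}
print(project([1.0, 0.5, 0.0, 1.0], kept))   # [1.0, 0.5, 1.0]
print(kept[2])                               # 3
```

The last line shows the mapping back: a split on compacted feature 2 refers to original feature 3.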

Targeted for branch-1.5 and master.

CC: @manishamde @mengxr @yanboliang

@SparkQA

SparkQA commented Aug 14, 2015

Test build #40845 has finished for PR 8187 at commit df0ebb7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Aug 14, 2015

LGTM except the style issue.

Style: add a space after the comma.

@SparkQA

SparkQA commented Aug 14, 2015

Test build #40889 has finished for PR 8187 at commit d4806ab.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member Author

Merging with master and branch-1.5

asfgit pushed a commit that referenced this pull request Aug 14, 2015
Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8187 from jkbradley/tree-1cat.

(cherry picked from commit 7ecf0c4)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>

@asfgit closed this in 7ecf0c4 on Aug 14, 2015
CodingCat pushed a commit to CodingCat/spark that referenced this pull request Aug 17, 2015
Author: Joseph K. Bradley <joseph@databricks.com>

Closes apache#8187 from jkbradley/tree-1cat.

@jkbradley deleted the tree-1cat branch on August 17, 2015 18:00