
Conversation

@drJAGartner

Supervised machine learning requires partitioning data into labeled data sets, typically according to a set of rules. Adding an RDD method that performs this partitioning in a single step is a first step toward single-pass RDD partitioning.
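For readers of this thread, here is a minimal sketch of the kind of helper being described, assuming a rule-per-label API; the object name `LabeledSplits`, the method `splitByRules`, and its signature are illustrative assumptions, not the code in this patch.

```scala
import org.apache.spark.rdd.RDD

// Illustrative sketch only, not the patch's actual API: split one RDD into
// several labeled subsets according to a rule (predicate) per label.
object LabeledSplits {
  def splitByRules[T](data: RDD[T], rules: Map[String, T => Boolean]): Map[String, RDD[T]] = {
    data.cache() // each rule triggers its own filter pass over the same cached parent
    rules.map { case (label, rule) => label -> data.filter(rule) }
  }
}
```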

@AmplabJenkins

Can one of the admins verify this patch?

@markhamstra
Contributor

This needs to be implemented in Scala: Spark's architecture allows a Scala implementation to be exposed via the Python, Java, R, etc. APIs, but not the other way around, with one of those APIs serving as the root implementation.

@JoshRosen
Contributor

I don't think that we should add this to the core RDD API itself. This method is probably more useful / discoverable as part of one of the ml or mllib subpackages. In addition, the implementation here is no more efficient than what a user could write themselves (i.e. it's just syntactic sugar for ~ two lines of code).
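For reference, the user-level equivalent being alluded to is roughly the following sketch; `manualSplit` and the `isTraining` predicate are placeholders for illustration, not anything from this patch or from Spark's API.

```scala
import org.apache.spark.rdd.RDD

// Roughly the "two lines" a user can already write: one filter per subset
// over the same cached parent RDD. `manualSplit` and `isTraining` are placeholders.
def manualSplit[T](data: RDD[T], isTraining: T => Boolean): (RDD[T], RDD[T]) = {
  data.cache() // both filters reuse the cached parent instead of recomputing it
  (data.filter(isTraining), data.filter(x => !isTraining(x)))
}
```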

If you'd still like to propose this change, please see the instructions at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and follow the process for filing a JIRA.

@drJAGartner
Author

Sounds good, thanks for the input.
