Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-14549][ML] Copy the Vector and Matrix classes from mllib to ml in mllib-local #12317

Closed
wants to merge 14 commits into from

Conversation

dbtsai
Copy link
Member

@dbtsai dbtsai commented Apr 12, 2016

What changes were proposed in this pull request?

This task will copy the Vector and Matrix classes from mllib to ml package in mllib-local jar. The UDTs and since annotation in ml vector and matrix will be removed from now. UDTs will be achieved by #SPARK-14487, and since will be replaced by /* @ since 1.2.0 */

The BLAS implementation will be copied, and some of the test utilities will be copies as well.

Summary of changes:

  1. In mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala
    • Copied from mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala
    • logDebug("gemm: alpha is equal to 0 and beta is equal to 1. Returning C.") is removed in ml version.
  2. In mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala
  3. In mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
    • Copied from mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
    • @Since was removed.
    • UDT related code was removed.
    • In def parseNumeric, it was throwing throw new SparkException(s"Cannot parse $other."), and now it's throwing throw new IllegalArgumentException(s"Cannot parse $other.")
  4. In mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
    • For consistency with ML version of vector, def parseNumeric is now throwing throw new IllegalArgumentException(s"Cannot parse $other.")
  5. mllib/src/main/scala/org/apache/spark/mllib/util/NumericParser.scala is moved to mllib-local/src/main/scala/org/apache/spark/ml/util/NumericParser.scala
    • All the throw new SparkException were replaced by throw new IllegalArgumentException

How was this patch tested?

unit tests

@SparkQA
Copy link

SparkQA commented Apr 12, 2016

Test build #55571 has finished for PR 12317 at commit 5795d54.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DenseMatrix (
    • class SparseMatrix (
    • class DenseVector (val values: Array[Double]) extends Vector
    • class SparseVector (

@SparkQA
Copy link

SparkQA commented Apr 12, 2016

Test build #55574 has finished for PR 12317 at commit 1fba791.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 12, 2016

Test build #55589 has finished for PR 12317 at commit f9473f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor

mengxr commented Apr 12, 2016

@dbtsai Please remove WIP from title when this is ready for review:)

@dbtsai dbtsai changed the title [SPARK-14549][ML][WIP] Copy the Vector and Matrix classes from mllib to ml in mllib-local [SPARK-14549][ML] Copy the Vector and Matrix classes from mllib to ml in mllib-local Apr 13, 2016
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
<version>3.2.10</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core also depends on json4s (See https://github.com/apache/spark/blob/master/core/pom.xml#L195.). Shall we move the version definition to a common place?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another JIRA to do so. We also want to consolidate the version of breeze. https://issues.apache.org/jira/browse/SPARK-14612

If you want, I can do it in this PR.

@SparkQA
Copy link

SparkQA commented Apr 13, 2016

Test build #55752 has finished for PR 12317 at commit 69f6982.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 14, 2016

Test build #55755 has finished for PR 12317 at commit 38914b2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 14, 2016

Test build #55760 has finished for PR 12317 at commit 45a0ee8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 14, 2016

Test build #55766 has finished for PR 12317 at commit 150a60b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 14, 2016

Test build #55827 has finished for PR 12317 at commit 35e2da5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai
Copy link
Member Author

dbtsai commented Apr 14, 2016

@mengxr The versions of the dependencies have been moved to the parent POM.

@dbtsai
Copy link
Member Author

dbtsai commented Apr 14, 2016

+cc @srowen @MLnick

@mengxr
Copy link
Contributor

mengxr commented Apr 15, 2016

I'm making another pass.

@@ -49,6 +49,10 @@
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created https://issues.apache.org/jira/browse/SPARK-14653 as a follow-up item.

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55902 has finished for PR 12317 at commit e626d1d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor

mengxr commented Apr 15, 2016

LGTM. Merged into master. Thanks!

@asfgit asfgit closed this in 96534aa Apr 15, 2016
@dbtsai dbtsai deleted the dbtsai-ml-vector branch November 11, 2016 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants