[SPARK-20687][MLLIB] mllib.Matrices.fromBreeze may crash when converting from Breeze sparse matrix #17940
Conversation
…ng breeze CSCMatrix

In an operation on two CSCMatrices A and B, the resulting matrix C may have extra 0s in its rowIndices and data arrays, which Breeze keeps for performance reasons. This causes problems when converting back to an mllib.Matrix, because that conversion relies on rowIndices and data being consistent with colPtrs. Therefore rowIndices and data must be truncated to the number of active elements held by C before creating Spark's SparseMatrix.
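For context, a minimal sketch of how a sparse-sparse operation can leave the backing arrays longer than the number of active entries. The matrices and values here are illustrative only; whether a given operation actually over-allocates depends on Breeze's internal buffer handling.

```scala
import breeze.linalg.CSCMatrix

object TrailingZerosDemo {
  def main(args: Array[String]): Unit = {
    // Build two small sparse matrices; the values are arbitrary.
    val builderA = new CSCMatrix.Builder[Double](rows = 2, cols = 3)
    builderA.add(0, 0, 2.0)
    builderA.add(1, 0, 2.0)
    val a = builderA.result()

    val builderB = new CSCMatrix.Builder[Double](rows = 2, cols = 3)
    builderB.add(0, 0, 2.0)
    builderB.add(0, 1, 1e-15)
    builderB.add(1, 0, 2.0)
    builderB.add(1, 2, 1e-15)
    val b = builderB.result()

    // A sparse-sparse operation; the result's rowIndices/data arrays may end
    // up longer than activeSize, while colPtrs only accounts for activeSize.
    val c = a - b
    println(s"activeSize = ${c.activeSize}, rowIndices.length = ${c.rowIndices.length}")
    // If the lengths differ, copying rowIndices/data verbatim into a Spark
    // SparseMatrix yields arrays that disagree with colPtrs.
  }
}
```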
Please fix up the title and description per http://spark.apache.org/contributing.html
Sorry about that. I added more context in the description and updated the title.
Test build #3710 has finished for PR 17940 at commit
Need to fix a line in the test because it's too long.
Thanks for the PR. Also cc @yanboliang. Looks like the issue is still present.
// despite sm being a valid CSCMatrix.
// We need to truncate both arrays (rowIndices, data)
// to the real size of the vector sm.activeSize to allow valid conversion
Maybe we can add if (sm.activeSize != sm.rowIndices.length) here, since the truncation is only needed in that case.
Please refer to https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/linalg/CSCMatrix.scala#L130
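For illustration, a standalone sketch of the guarded, slice-based conversion discussed here, shown as a hypothetical helper rather than the PR's actual patch (which lives inside Matrices.fromBreeze):

```scala
import breeze.linalg.{CSCMatrix => BSM}
import org.apache.spark.mllib.linalg.SparseMatrix

// Hypothetical helper mirroring the guarded slice approach: only truncate
// when Breeze reports fewer active entries than the backing arrays hold.
def fromBreezeSparse(sm: BSM[Double]): SparseMatrix = {
  if (sm.activeSize != sm.rowIndices.length) {
    val truncRowIndices = sm.rowIndices.slice(0, sm.activeSize)
    val truncData = sm.data.slice(0, sm.activeSize)
    new SparseMatrix(sm.rows, sm.cols, sm.colPtrs, truncRowIndices, truncData)
  } else {
    new SparseMatrix(sm.rows, sm.cols, sm.colPtrs, sm.rowIndices, sm.data)
  }
}
```

With the guard, inputs that are already consistent avoid the two array copies, which is what the quoted diff lines below implement.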
val truncRowIndices = sm.rowIndices.slice(0, sm.activeSize)
val truncData = sm.data.slice(0, sm.activeSize)
This is the same as calling compact(). To make it less sensitive to Breeze's internal implementation, how about:
val matCopy = sm.copy
matCopy.compact()
I'm implementing both suggestions. However, wouldn't the sm.copy be more expensive than just doing those two slices?
@@ -46,6 +46,26 @@ class MatricesSuite extends SparkFunSuite {
}
}

test("Test Breeze Conversion Bug - SPARK-20687") {
Use a more specific name: "Test FromBreeze when Breeze.CSCMatrix.rowIndices has trailing zeros".
And move the test after the existing unit test "fromBreeze with sparse matrix".
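A hedged sketch of what such a regression test could look like inside MatricesSuite (where the private[mllib] Matrices.fromBreeze is visible). The concrete matrices are illustrative and not necessarily the ones in the merged test:

```scala
import breeze.linalg.{CSCMatrix => BSM, Matrix => BM}

test("fromBreeze when Breeze.CSCMatrix.rowIndices has trailing zeros - SPARK-20687") {
  // Two sparse matrices whose difference exercises the sparse-sparse path.
  val builderA = new BSM.Builder[Double](rows = 2, cols = 3)
  builderA.add(0, 0, 2.0)
  builderA.add(1, 0, 2.0)

  val builderB = new BSM.Builder[Double](rows = 2, cols = 3)
  builderB.add(0, 0, 2.0)
  builderB.add(0, 1, 1e-15)
  builderB.add(0, 2, 1e-15)
  builderB.add(1, 0, 2.0)
  builderB.add(1, 1, 1e-15)
  builderB.add(1, 2, 1e-15)

  val diff: BM[Double] = builderA.result() - builderB.result()
  // Before the fix, this conversion could throw when the Breeze result
  // carried trailing entries in rowIndices/data.
  val converted = Matrices.fromBreeze(diff)
  assert(converted.numRows === 2)
  assert(converted.numCols === 3)
}
```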
Thanks for the update. Looks good to me.
@@ -992,7 +992,24 @@ object Matrices {
new DenseMatrix(dm.rows, dm.cols, dm.data, dm.isTranspose)
case sm: BSM[Double] =>
// There is no isTranspose flag for sparse matrices in Breeze
new SparseMatrix(sm.rows, sm.cols, sm.colPtrs, sm.rowIndices, sm.data)

// Some Breeze CSCMatrices may have extra trailing zeros in
minor: make this comment more compact in fewer lines.
Test build #3714 has started for PR 17940 at commit
Test build #3727 has started for PR 17940 at commit
Test build #3733 has finished for PR 17940 at commit
smC.compact()
new SparseMatrix(smC.rows, smC.cols, smC.colPtrs, smC.rowIndices, smC.data)
} else {
new SparseMatrix(sm.rows, sm.cols, sm.colPtrs, sm.rowIndices, sm.data) |
@hhbyyh what do you think of the current state? I wasn't clear if you were requesting a specific change to the comment.
Here, if you're going to change it again, you could simplify by not repeating the new SparseMatrix(...)
call. Just pick which sparse matrix you're copying in the if statement (original or compacted) and then return the result of converting that.
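In other words, something along these lines: a sketch of the suggested restructuring, shown as a standalone helper with an illustrative name (the real change stays inside Matrices.fromBreeze, as the final diff further down shows):

```scala
import breeze.linalg.{CSCMatrix => BSM}
import org.apache.spark.mllib.linalg.SparseMatrix

// Pick the matrix to convert first (the original, or a compacted copy when
// the backing arrays carry trailing entries), then build SparseMatrix once.
def fromBreezeSparseCompacting(sm: BSM[Double]): SparseMatrix = {
  val nsm = if (sm.rowIndices.length > sm.activeSize) {
    val csm = sm.copy
    csm.compact()
    csm
  } else {
    sm
  }
  new SparseMatrix(nsm.rows, nsm.cols, nsm.colPtrs, nsm.rowIndices, nsm.data)
}
```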
Hey, I shortened the comment and removed the repeated new SparseMatrix call.
Give it a look now.
LGTM
@@ -992,7 +992,16 @@ object Matrices {
new DenseMatrix(dm.rows, dm.cols, dm.data, dm.isTranspose)
case sm: BSM[Double] =>
// There is no isTranspose flag for sparse matrices in Breeze
new SparseMatrix(sm.rows, sm.cols, sm.colPtrs, sm.rowIndices, sm.data)
val nsm = if (sm.rowIndices.length > sm.activeSize) {
// This sparse matrix has trainling zeros.
trailing
Oops.
Test build #3746 has finished for PR 17940 at commit
…ing from Breeze sparse matrix ## What changes were proposed in this pull request? When two Breeze SparseMatrices are operated, the result matrix may contain provisional 0 values extra in rowIndices and data arrays. This causes an incoherence with the colPtrs data, but Breeze get away with this incoherence by keeping a counter of the valid data. In spark, when this matrices are converted to SparseMatrices, Sparks relies solely on rowIndices, data, and colPtrs, but these might be incorrect because of breeze internal hacks. Therefore, we need to slice both rowIndices and data, using their counter of active data This method is at least called by BlockMatrix when performing distributed block operations, causing exceptions on valid operations. See http://stackoverflow.com/questions/33528555/error-thrown-when-using-blockmatrix-add ## How was this patch tested? Added a test to MatricesSuite that verifies that the conversions are valid and that code doesn't crash. Originally the same code would crash on Spark. Bugfix for https://issues.apache.org/jira/browse/SPARK-20687 Author: Ignacio Bermudez <ignaciobermudez@gmail.com> Author: Ignacio Bermudez Corrales <icorrales@splunk.com> Closes #17940 from ghoto/bug-fix/SPARK-20687. (cherry picked from commit 06dda1d) Signed-off-by: Sean Owen <sowen@cloudera.com>
Merged to master/2.2/2.1
What changes were proposed in this pull request?
When an operation is performed on two Breeze sparse matrices, the result matrix may contain extra, provisional 0 values in its rowIndices and data arrays. This makes those arrays inconsistent with colPtrs, but Breeze gets away with the inconsistency by keeping a counter of the valid entries (activeSize).
In Spark, when these matrices are converted to SparseMatrices, Spark relies solely on rowIndices, data, and colPtrs, but those arrays may be incorrect because of Breeze's internal optimizations. Therefore, rowIndices and data need to be truncated using the counter of active entries.
This conversion is called at least by BlockMatrix when performing distributed block operations, causing exceptions on otherwise valid operations (a sketch of that code path follows this description).
See http://stackoverflow.com/questions/33528555/error-thrown-when-using-blockmatrix-add
How was this patch tested?
Added a test to MatricesSuite that verifies that the conversion is valid and that the code doesn't crash. The same code previously crashed on Spark.
Bugfix for https://issues.apache.org/jira/browse/SPARK-20687
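For reference, a hedged sketch of the BlockMatrix code path that reaches this conversion. The local-mode setup and matrix values are illustrative; whether a particular input triggered the pre-fix crash depended on Breeze's internal buffers.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrices, Matrix}
import org.apache.spark.mllib.linalg.distributed.BlockMatrix

object BlockMatrixAddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SPARK-20687-sketch").setMaster("local[2]"))

    // A single 2x2 sparse block: one entry per column.
    val block: ((Int, Int), Matrix) =
      ((0, 0), Matrices.sparse(2, 2, Array(0, 1, 2), Array(0, 1), Array(1.0, 2.0)))
    val a = new BlockMatrix(sc.parallelize(Seq(block)), 2, 2)
    val b = new BlockMatrix(sc.parallelize(Seq(block)), 2, 2)

    // add() converts matching sparse blocks to Breeze, adds them, and converts
    // the result back through Matrices.fromBreeze -- the path fixed here.
    val c = a.add(b)
    c.blocks.collect().foreach(println)

    sc.stop()
  }
}
```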