Skip to content

Conversation

@sethah
Copy link
Contributor

@sethah sethah commented Mar 25, 2017

What changes were proposed in this pull request?

Use the new compressed method on matrices to store the logistic regression coefficients as sparse or dense - whichever is requires less memory.

Marked as WIP so we can add some performance test results. Basically, we should see if prediction is slower because of using a sparse matrix over a dense one. This can happen since sparse matrices do not use native BLAS operations when computing the margins.

How was this patch tested?

Unit tests added.

@SparkQA
Copy link

SparkQA commented Mar 25, 2017

Test build #75205 has finished for PR 17426 at commit c677696.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dbtsai dbtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Merged into master. Thanks.

@@ -617,26 +612,13 @@ class LogisticRegression @Since("1.2.0") (
denseCoefficientMatrix.update(_ - coefficientMean)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, we could compress the coefficients from driver to executors to save communication cost. Can do in a separate PR.

@asfgit asfgit closed this in be85245 Mar 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants