[SPARK-26787] Fix standardizeLabels error message in WeightedLeastSquares #23705

Closed
wants to merge 2 commits into from

bscan
Contributor

@bscan bscan commented Jan 30, 2019

The error message falsely states that standardization=True is causing the problem, even when standardization=False. The real issue is standardizeLabel=True, which LinearRegression sets automatically and which is not currently exposed in the public API.
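
To make the mismatch concrete, here is a minimal, self-contained sketch of the guard this PR touches. It is not the Spark source: `LabelGuardSketch`, `checkConstantLabel`, `standardizeFeatures`, and `labelStd` are hypothetical names, and the `labelStd == 0.0` condition is an assumption inferred from the message text, since the enclosing branch is only partly visible in the diff. Only `regParam`, `standardizeLabel`, and the message string come from the PR.

```scala
// Hypothetical, simplified stand-in for the guard in WeightedLeastSquares.
// It has no Spark dependency and only mirrors the condition shown in the diff.
object LabelGuardSketch {

  // Returns Left(message) when the guard would fail, Right(()) otherwise.
  def checkConstantLabel(
      regParam: Double,
      standardizeFeatures: Boolean, // stands in for the user-facing `standardization` flag
      standardizeLabel: Boolean,    // set internally, not by the caller
      labelStd: Double              // assumed: the guard is reached when this is zero
  ): Either[String, Unit] = {
    if (labelStd == 0.0 && regParam > 0.0 && standardizeLabel) {
      // Note that standardizeFeatures plays no part in the condition,
      // yet the old message blames standardization=true.
      Left("The standard deviation of the label is zero. " +
        "Model cannot be regularized with standardization=true")
    } else {
      Right(())
    }
  }

  def main(args: Array[String]): Unit = {
    // The user asked for standardization=false, but the guard still fails.
    val result = checkConstantLabel(
      regParam = 0.1,
      standardizeFeatures = false,
      standardizeLabel = true,
      labelStd = 0.0)
    println(result)
  }
}
```

Flipping `standardizeFeatures` in the call above does not change the outcome, which is why the old wording points users at a setting that cannot fix the error.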

What changes were proposed in this pull request?

A simple change to an error message. More details here: https://jira.apache.org/jira/browse/SPARK-26787

How was this patch tested?

This does not change any functionality.

@HyukjinKwon
Member

Can you fix the PR title to be more specific?

@@ -133,7 +133,8 @@ private[ml] class WeightedLeastSquares(
           return new WeightedLeastSquaresModel(coefficients, intercept, diagInvAtWA, Array(0D))
         } else {
           require(!(regParam > 0.0 && standardizeLabel), "The standard deviation of the label is " +
-            "zero. Model cannot be regularized with standardization=true")
+            "zero. Model cannot be regularized with standardizeLabel=true " +
+            "(standardizeLabel is not exposed in the LinearRegression API)")
Member

The second line isn't meaningful to the caller. I'd just say "Model cannot be regularized when labels are standardized".

Contributor Author

Good point. Since the primary way to get here is through calling LinearRegression(), I was trying to give the user a message that indicates it's a current limitation of the API, not the implementation/algorithm. I updated to your suggestion though as I do think it's an improvement over the current message of standardization=true, which was simply incorrect. Thanks for reviewing!
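
For reference, a sketch of how the guard might read with the reviewer's wording. The message here is assembled from the unchanged first sentence plus the suggested second sentence, so the exact merged text is an assumption; `ReworkedMessageSketch` and `guard` are hypothetical names. The only library behavior relied on is that Scala's `require` throws an `IllegalArgumentException` whose message is prefixed with "requirement failed: ".

```scala
import scala.util.Try

// Hypothetical sketch of the guard with the reviewer-suggested wording.
object ReworkedMessageSketch {

  // Assumed final text: original first sentence + reviewer's suggestion.
  val message: String = "The standard deviation of the label is zero. " +
    "Model cannot be regularized when labels are standardized."

  // Same condition as in the diff; only the message differs.
  def guard(regParam: Double, standardizeLabel: Boolean): Unit =
    require(!(regParam > 0.0 && standardizeLabel), message)

  def main(args: Array[String]): Unit = {
    // Regularization plus label standardization: the guard throws.
    val failure = Try(guard(regParam = 0.1, standardizeLabel = true))
    println(failure.failed.map(_.getMessage).getOrElse("no error"))

    // No regularization: the guard passes.
    println(Try(guard(regParam = 0.0, standardizeLabel = true)).isSuccess)
  }
}
```

This wording no longer names a parameter the caller cannot set, which is the point raised in the review.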

@bscan bscan changed the title from "[SPARK-26787] Mistake in error message." to "[SPARK-26787] Fix standardizeLabels error message in WeightedLeastSquares" Jan 31, 2019
Updating based on code review.
@SparkQA

SparkQA commented Jan 31, 2019

Test build #4540 has finished for PR 23705 at commit 9f50960.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Feb 1, 2019

Merged to master

@srowen srowen closed this in e44f308 Feb 1, 2019
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ares

Closes apache#23705 from bscan/bscan-errormsg-1.

Authored-by: bscan <brianjscannell@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>