-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-26787] Fix standardizeLabels error message in WeightedLeastSquares #23705
Conversation
Error message falsely states standardization=True is causing a problem, even when standardization=False. The real issue is standardizeLabels=True, which is set automatically in LinearRegression and not currently available in the Public API.
Can you fix the PR title to be more specific? |
@@ -133,7 +133,8 @@ private[ml] class WeightedLeastSquares( | |||
return new WeightedLeastSquaresModel(coefficients, intercept, diagInvAtWA, Array(0D)) | |||
} else { | |||
require(!(regParam > 0.0 && standardizeLabel), "The standard deviation of the label is " + | |||
"zero. Model cannot be regularized with standardization=true") | |||
"zero. Model cannot be regularized with standardizeLabel=true " + | |||
"(standardizeLabel is not exposed in the LinearRegression API)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second line isn't meaningful to the caller. I'd just say "Model cannot be regularized when labels are standardized".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Since the primary way to get here is through calling LinearRegression(), I was trying to give the user a message that indicates it's a current limitation of the API, not the implementation/algorithm. I updated to your suggestion though as I do think it's an improvement over the current message of standardization=true
, which was simply incorrect. Thanks for reviewing!
Updating based on code review.
Test build #4540 has finished for PR 23705 at commit
|
Merged to master |
…ares Error message falsely states standardization=True is causing a problem, even when standardization=False. The real issue is standardizeLabels=True, which is set automatically in LinearRegression and not currently available in the Public API. ## What changes were proposed in this pull request? A simple change to an error message. More details here: https://jira.apache.org/jira/browse/SPARK-26787 ## How was this patch tested? This does not change any functionality. Closes apache#23705 from bscan/bscan-errormsg-1. Authored-by: bscan <brianjscannell@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>
Error message falsely states standardization=True is causing a problem, even when standardization=False. The real issue is standardizeLabels=True, which is set automatically in LinearRegression and not currently available in the Public API.
What changes were proposed in this pull request?
A simple change to an error message. More details here: https://jira.apache.org/jira/browse/SPARK-26787
How was this patch tested?
This does not change any functionality.