[SPARK-16710] [SparkR] [ML] spark.glm should support weightCol #14346
Test build #62824 has finished for PR 14346 at commit
cc @mengxr
cc @junyangq
@@ -119,7 +121,7 @@ NULL
 #' @note spark.glm since 2.0.0
 #' @seealso \link{glm}, \link{read.ml}
 setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"),
-          function(data, formula, family = gaussian, tol = 1e-6, maxIter = 25) {
+          function(data, formula, family = gaussian, weightCol = NULL, tol = 1e-6, maxIter = 25) {
You might not want to add a parameter in the middle of the list. If someone has existing code that calls this function with positional arguments, their values might misalign.
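The concern in this comment can be sketched with plain R functions. The three functions below are hypothetical stand-ins that only mimic the argument lists under discussion (they are not the SparkR implementation, and `family` is a string here for simplicity). They show what happens to a caller that passes `tol` and `maxIter` by position.

```r
# Hypothetical stand-ins for the signatures under discussion; plain R only,
# no SparkR required. Each one just echoes back what it received.
old_glm <- function(data, formula, family = "gaussian", tol = 1e-6, maxIter = 25) {
  list(tol = tol, maxIter = maxIter)
}

# weightCol inserted in the middle of the list, as in the diff hunk.
mid_glm <- function(data, formula, family = "gaussian", weightCol = NULL,
                    tol = 1e-6, maxIter = 25) {
  list(weightCol = weightCol, tol = tol, maxIter = maxIter)
}

# Alternative: weightCol appended after the existing parameters.
end_glm <- function(data, formula, family = "gaussian", tol = 1e-6,
                    maxIter = 25, weightCol = NULL) {
  list(weightCol = weightCol, tol = tol, maxIter = maxIter)
}

# An existing call site that passes tol = 1e-4 and maxIter = 10 by position.
before  <- old_glm(NULL, y ~ x, "gaussian", 1e-4, 10)
shifted <- mid_glm(NULL, y ~ x, "gaussian", 1e-4, 10)
safe    <- end_glm(NULL, y ~ x, "gaussian", 1e-4, 10)

# In `shifted`, 1e-4 is bound to weightCol, 10 to tol, and maxIter silently
# falls back to its default of 25. In `safe`, the old call keeps its meaning.
str(shifted)
str(safe)
```

Appending the new parameter, or having callers use named arguments such as `tol = 1e-4`, avoids the silent shift.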
Test build #63081 has finished for PR 14346 at commit
LGTM
LGTM
LGTM. Merging this to master. @mengxr @felixcheung I didn't merge this into branch-2.0 as having Scala + R changes could affect the CRAN package we are building to match the 2.0 release. We can do a round of backports after that is done if required.
What changes were proposed in this pull request?
Training GLMs on weighted datasets is an important use case, but SparkR does not currently support it. In native R, users pass the `weights` argument to specify the weights vector. For `spark.glm`, we can pass in `weightCol`, which is consistent with MLlib.
How was this patch tested?
Unit test.
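For readers unfamiliar with the native R behaviour the description refers to, here is a minimal sketch in base R with made-up data. It fits a Gaussian GLM with the `weights` argument and checks that integer weights give the same coefficients as repeating each row that many times. The data frame and the column names `x`, `y`, and `w` are assumptions for illustration only. With this patch, the SparkR counterpart would presumably name the weight column instead, along the lines of `spark.glm(df, y ~ x, weightCol = "w")`.

```r
# Made-up data for illustration; w holds integer observation weights.
d <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1.1, 1.9, 3.2, 3.9),
  w = c(1, 2, 1, 3)
)

# Native R: pass the weights vector through the `weights` argument.
weighted <- glm(y ~ x, data = d, weights = w)

# Equivalent unweighted fit: repeat each row according to its weight.
expanded   <- d[rep(seq_len(nrow(d)), times = d$w), ]
replicated <- glm(y ~ x, data = expanded)

# The coefficient estimates agree up to numerical tolerance.
print(all.equal(coef(weighted), coef(replicated)))
```

Only the point estimates are compared here. Standard errors differ between the two fits because the replicated data set has more rows.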