[SPARK-13010] [ML] [SparkR] Implement a simple wrapper of AFTSurvivalRegression in SparkR #11447
#' model <- survreg(Surv(futime, fustat) ~ ecog_ps + rx, df)
#' summary(model)
#'}
setMethod("survreg", signature(formula = "formula", data = "DataFrame"),
We currently support only the "weibull" distribution in AFTSurvivalRegression, so we don't need a dist argument like R's survreg until we support more distributions.
Test build #52246 has finished for PR 11447 at commit
@hhbyyh Could you help review this PR?
Test build #52314 has finished for PR 11447 at commit
@yanboliang I tried your implementation with
and met the error
I can run the code in the unit tests, though.
Is there something I'm missing? Thanks.
@hhbyyh Thanks for reviewing.
or
We recommend the former.
Thanks. I'll take a closer look tomorrow.
@yanboliang The |
Please check out this test - you could |
var censorCol: String = null

val regex = "^Surv\\(([^,]+),([^,]+)\\)\\s*\\~\\s*(.+)".r
try {
I hit a MatchError when testing with " Surv ( futime, fustat ) ~ ecog.ps + tx - rx". Maybe we should document the regex in a comment.
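The failure described above can be reproduced without Spark. The sketch below assumes the formula string is matched with a Scala regex extractor, which requires the whole string to match; how the PR actually applies the regex is not shown in the diff excerpt, so treat that as an assumption. The `strict` pattern is copied from the diff. The `lenient` pattern, the `parse` helper, and the object name are hypothetical and are not part of the PR.

```scala
import scala.util.matching.Regex

object SurvFormulaSketch {
  // Pattern quoted from the diff under review.
  val strict: Regex = "^Surv\\(([^,]+),([^,]+)\\)\\s*\\~\\s*(.+)".r

  // Hypothetical variant (not from the PR): tolerates whitespace before
  // "Surv", around the parentheses, and around the comma.
  val lenient: Regex =
    "^\\s*Surv\\s*\\(\\s*([^,\\s]+)\\s*,\\s*([^,\\s)]+)\\s*\\)\\s*~\\s*(.+)$".r

  // Returns (labelCol, censorCol, right-hand side), or None when the formula
  // does not match, instead of throwing scala.MatchError.
  def parse(regex: Regex, formula: String): Option[(String, String, String)] =
    formula match {
      case regex(label, censor, rhs) => Some((label.trim, censor.trim, rhs.trim))
      case _                         => None
    }

  def main(args: Array[String]): Unit = {
    val compact = "Surv(futime, fustat) ~ ecog_ps + rx"
    val spaced  = " Surv ( futime, fustat ) ~ ecog.ps + tx - rx"

    println(parse(strict, compact))  // matches
    println(parse(strict, spaced))   // None: leading space and "Surv (" are rejected
    println(parse(lenient, spaced))  // matches
  }
}
```

Returning an `Option` lets the caller raise an error that states the expected formula shape, which serves the same purpose as documenting the regex in a comment.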
Only some minor comments.
@Since("2.0.0")
class AFTSurvivalRegressionSummary private[regression] (
@Since("2.0.0") @transient val predictions: DataFrame,
@Since("2.0.0") val predictionCol: String,
Could you please help me understand where predictionCol, labelCol, and featuresCol are used?
Test build #53431 has finished for PR 11447 at commit
Test build #53433 has finished for PR 11447 at commit
Test build #53434 has finished for PR 11447 at commit
@hhbyyh @mengxr The PR is ready for another pass.
Test build #53528 has finished for PR 11447 at commit
Test build #53530 has finished for PR 11447 at commit
rCoefs <- as.vector(coef(rModel))
rScale <- rModel$scale

expect_true(all(abs(rCoefs - coefs) < 1e-4))
Consider expect_equal(coefs, rCoefs, tolerance = 1e-4), which generates better error messages.
@yanboliang I did some factoring of |
Let's move to #11932
…gression in SparkR

## What changes were proposed in this pull request?
This PR continues the work in apache#11447: we implemented the wrapper of `AFTSurvivalRegression` named `survreg` in SparkR.

## How was this patch tested?
Test against output from R package survival's survreg.

cc mengxr felixcheung

Close apache#11447

Author: Yanbo Liang <ybliang8@gmail.com>

Closes apache#11932 from yanboliang/spark-13010-new.
What changes were proposed in this pull request?
Implement a simple wrapper of AFTSurvivalRegression named survreg in SparkR.
cc @mengxr
How was this patch tested?
Unit test.
Output of R:
Output of SparkR: