Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-19395][SparkR]Convert coefficients in summary to matrix #16730

Closed
wants to merge 4 commits into from

Conversation

actuaryzhang
Copy link
Contributor

What changes were proposed in this pull request?

The coefficients component in model summary should be 'matrix' but the underlying structure is indeed list. This affects several models except for 'AFTSurvivalRegressionModel' which has the correct implementation. The fix is to first unlist the coefficients returned from the callJMethod before converting to matrix. An example illustrates the issues:

data(iris)
df <- createDataFrame(iris)
model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian")
s <- summary(model)

> str(s$coefficients)
List of 8
 $ : num 6.53
 $ : num -0.223
 $ : num 0.479
 $ : num 0.155
 $ : num 13.6
 $ : num -1.44
 $ : num 0
 $ : num 0.152
 - attr(*, "dim")= int [1:2] 2 4
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "(Intercept)" "Sepal_Width"
  ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
> s$coefficients[, 2]
$`(Intercept)`
[1] 0.4788963

$Sepal_Width
[1] 0.1550809

This shows that the underlying structure of coefficients is still list.

@felixcheung @wangmiao1981

@SparkQA
Copy link

SparkQA commented Jan 28, 2017

Test build #72113 has finished for PR 16730 at commit c80d530.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

@actuaryzhang since this changes the output, could you open a JIRA for this?

@felixcheung
Copy link
Member

thanks, i'd be great if you could add some tests for this - perhaps just to check the correct type and/or a subset of values

@actuaryzhang actuaryzhang changed the title [Minor][SparkR]Convert coefficients in summary to matrix [SPARK-19395][SparkR]Convert coefficients in summary to matrix Jan 29, 2017
@actuaryzhang
Copy link
Contributor Author

@felixcheung I created a JIRA ticket and added in some tests. Please take a look. Thanks.

@SparkQA
Copy link

SparkQA commented Jan 29, 2017

Test build #72129 has finished for PR 16730 at commit 7ded346.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

merged to master, thanks!

@asfgit asfgit closed this in ce112ce Jan 31, 2017
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
## What changes were proposed in this pull request?
The `coefficients` component in model summary should be 'matrix' but the underlying structure is indeed list. This affects several models except for 'AFTSurvivalRegressionModel' which has the correct implementation. The fix is to first `unlist` the coefficients returned from the `callJMethod` before converting to matrix. An example illustrates the issues:

```
data(iris)
df <- createDataFrame(iris)
model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian")
s <- summary(model)

> str(s$coefficients)
List of 8
 $ : num 6.53
 $ : num -0.223
 $ : num 0.479
 $ : num 0.155
 $ : num 13.6
 $ : num -1.44
 $ : num 0
 $ : num 0.152
 - attr(*, "dim")= int [1:2] 2 4
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "(Intercept)" "Sepal_Width"
  ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
> s$coefficients[, 2]
$`(Intercept)`
[1] 0.4788963

$Sepal_Width
[1] 0.1550809
```

This  shows that the underlying structure of coefficients is still `list`.

felixcheung wangmiao1981

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes apache#16730 from actuaryzhang/sparkRCoef.
@actuaryzhang actuaryzhang deleted the sparkRCoef branch July 1, 2017 00:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants