Fix base-inherited behavior of variance() on matrices #55
Comments
I don't mind adding a I'm happy for you to work on a PR for this.
make variance() return variance (not covariance) on matrices, closes #55
I'm preparing a release for this now and wanted to get your opinion on the appropriateness of this change in relation to multivariate distributions. The current behaviour is:
library(distributional)
ux <- c(1, 2)
mx <- matrix(c(0,3,1,4), nrow = 2)
uv <- dist_normal(0,1)
mv <- dist_multivariate_normal(mu = list(c(1,2)), sigma = list(matrix(c(4,2,2,3), ncol=2)))
dimnames(mv) <- c("a", "b")
mean(ux)
#> [1] 1.5
mean(mx)
#> [1] 2
mean(uv)
#> [1] 0
mean(mv)
#>      a b
#> [1,] 1 2
variance(ux)
#> [1] 0.5
variance(mx)
#> [1] 3.333333
variance(uv)
#> [1] 1
variance(mv)
#> [[1]]
#>      [,1] [,2]
#> [1,]    4    2
#> [2,]    2    3
Created on 2021-10-04 by the reprex package (v2.0.0)
I'm now considering changing the multivariate distribution's
variance(mv)
#>      a b
#> [1,] 4 3
A new generic, Does this sound reasonable? The other question would be what
Yeah, I agree --- I think for a multivariate normal I would expect
Thanks :) Here's what I've got so far for
library(distributional)
ux <- c(1, 2)
mx <- matrix(c(0,3,1,4), nrow = 2)
uv <- dist_normal(0,1)
mv <- dist_multivariate_normal(mu = list(c(1,2)), sigma = list(matrix(c(4,2,2,3), ncol=2)))
dimnames(mv) <- c("a", "b")
mean(ux)
#> [1] 1.5
mean(mx)
#> [1] 2
mean(uv)
#> [1] 0
mean(mv)
#>      a b
#> [1,] 1 2
variance(ux)
#> [1] 0.5
variance(mx)
#> [1] 3.333333
variance(uv)
#> [1] 1
variance(mv)
#>      a b
#> [1,] 4 3
covariance(ux)
#> Error in stats::cov(x, ...): supply both 'x' and 'y' or a matrix-like 'x'
covariance(mx)
#>      [,1] [,2]
#> [1,]  4.5  4.5
#> [2,]  4.5  4.5
covariance(uv)
#> [1] 1
covariance(mv)
#> [[1]]
#>      [,1] [,2]
#> [1,]    4    2
#> [2,]    2    3
Created on 2021-10-11 by the reprex package (v2.0.0)
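The two outputs for mv shown above are related in a simple way: the per-dimension variances are the diagonal of the covariance matrix. A quick check in base R, reusing the sigma from the reprex (this is only an illustration and makes no distributional calls; marginal_var is a name chosen here):

```r
# Covariance matrix used for mv in the reprex.
sigma <- matrix(c(4, 2, 2, 3), ncol = 2)

# Marginal (per-dimension) variances are the diagonal entries.
marginal_var <- diag(sigma)
marginal_var
#> [1] 4 3
```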
Looks good to me!
The variance() generic inherits a misfeature of base-R var() in that for matrices it returns a covariance matrix. This is a misfeature in my opinion as it means that (1) var() does not parallel sd(); (2) var() returns a different type of output depending on what it guesses the caller's intent is rather than just providing a consistent API; and (3) cov() returns the covariance matrix anyway so there is no need for var() to do so. Because variance() delegates to var() in the default case, it also has this behavior.
For example:
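A minimal base-R sketch of the behavior described above (not necessarily the author's original example; the variable names are chosen here for illustration):

```r
# Base R only; mx matches the matrix used elsewhere in this thread.
mx <- matrix(c(0, 3, 1, 4), nrow = 2)

v_mat <- var(mx)             # 2 x 2 covariance matrix of the columns
s_mat <- sd(mx)              # a single number: sd() treats mx as a plain vector
v_vec <- var(as.vector(mx))  # a single number, the analogue of sd(mx)^2

dim(v_mat)                   # var() kept the matrix structure
length(s_mat)                # sd() did not
all.equal(s_mat^2, v_vec)    # sd() agrees with the flattened variance
```

So var() and sd() disagree about what a matrix argument means, which is the inconsistency in point (1).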
I would expect variance() on a matrix to behave much like sd() does; i.e. return a single value the same as it does on a vector.
We ran into this problem in {posterior} as we would like people to be able to run summary functions over posterior samples that are stored in matrices (see stan-dev/posterior#121). Since obviously we can't fix base var(), and we already have a dependency on {distributional}, we were hoping to be able to use distributional's variance() for this purpose.
I think the fix should be straightforward, either by changing this function definition:
distributional/R/distribution.R
Lines 220 to 223 in ab2fd9e
to something like this:
Or by adding a function definition like this:
Are either of those changes something you'd be willing to have in distributional? If so I'd be happy to submit a PR for whichever solution you prefer.
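For readers who want to see the shape of the two options, here is a rough sketch under stated assumptions. variance2() is a hypothetical stand-in generic defined only for this illustration; it is not distributional's variance() and does not reproduce the code referenced above.

```r
# variance2() is a hypothetical stand-in; it is not part of distributional.
variance2 <- function(x, ...) UseMethod("variance2")

# Option 1: have the default method flatten its input before calling var(),
# so a matrix is treated like a vector (as sd() does).
variance2.default <- function(x, ...) stats::var(as.vector(x), ...)

# Option 2: leave the default alone and add a matrix-specific method.
# (Shown for comparison; with Option 1 in place it is redundant.)
variance2.matrix <- function(x, ...) stats::var(as.vector(x), ...)

mx <- matrix(c(0, 3, 1, 4), nrow = 2)
variance2(mx)       # single value, matching variance(mx) in the reprex output
variance2(c(1, 2))  # vectors behave as before
```

Either option yields a single number for a matrix; the second keeps the change confined to matrix inputs.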