
U and V #1

Closed
prockenschaub opened this issue Aug 24, 2021 · 6 comments
prockenschaub commented Aug 24, 2021

First off, thank you very much for your package and the accompanying vignette. It helped me a lot in getting to grips with the matrix normal distribution!

I have been using your package to draw samples from the posterior of a multivariate linear regression, which follows an inverse-Wishart times a matrix normal distribution (see for example Wikipedia). While doing so, I stumbled across an inconsistency in the variance of those draws, and I think it is because U and V in the code below should be switched.

Sigma <- kronecker(U,V)

Example

Sorry for the slightly verbose example, but here is some code that implements what I meant to do. It compares the results obtained via matrixNormal (which differ from what I expected) to those that I get using the matrixsampling package (which are in line with expectations).

library(tidyverse)
library(matrixNormal)
library(matrixsampling)
library(MCMCpack)

set.seed(42)

# Simulate example data for multivariate linear regression
X <- cbind(1, matrix(rnorm(2000), ncol = 2))
beta <- matrix(c(5, -1, 1, 0, 3, 2), ncol = 2)

Y <- cbind(
  X %*% beta[, 1] + rnorm(1000, sd = sqrt(2)),
  X %*% beta[, 2] + rnorm(1000, sd = sqrt(0.5))
)

# Fit model and get MLE solution
fit <- lm(Y ~ X[, -1]) # NOTE: remove intercept here because it's added by lm
C <- coefficients(fit)
S <- estVar(fit)
v <- df.residual(fit)

draw_from_model <- function(X, C, S, v, rmnorm) {
  #' Draw from the posterior of the model parameters
  #' 
  #' @param X Predictor matrix (n x k)
  #' @param C Matrix of estimated coefficients (k x m)
  #' @param S Matrix of estimated residual covariance (m x m)
  #' @param v Residual degrees of freedom (scalar)
  #' 
  #' @note
  #' P(Sigma) ~ inverse-Wishart(v, S)
  #' P(Beta | Sigma) ~ Matrix-Normal(C, (X'X)^-1, Sigma)
  Sigma <- MCMCpack::riwish(v, (v - ncol(S) - 1) * S)   
  Beta <- rmnorm(M = C, U = solve(t(X) %*% X), V = Sigma)
  Beta
}

# Estimated covariance of the coefficients (as obtained from the model)
vcov(fit)
#>                Y1:(Intercept)   Y1:X[, -1]1   Y1:X[, -1]2 Y2:(Intercept)
#> Y1:(Intercept)   2.124127e-03  5.448705e-05  1.106835e-05  -2.028120e-05
#> Y1:X[, -1]1      5.448705e-05  2.114322e-03 -2.145732e-05  -5.202434e-07
#> Y1:X[, -1]2      1.106835e-05 -2.145732e-05  2.185500e-03  -1.056808e-07
#> Y2:(Intercept)  -2.028120e-05 -5.202434e-07 -1.056808e-07   4.891852e-04
#> Y2:X[, -1]1     -5.202434e-07 -2.018759e-05  2.048748e-07   1.254834e-05
#> Y2:X[, -1]2     -1.056808e-07  2.048748e-07 -2.086719e-05   2.549036e-06
#>                   Y2:X[, -1]1   Y2:X[, -1]2
#> Y1:(Intercept)  -5.202434e-07 -1.056808e-07
#> Y1:X[, -1]1     -2.018759e-05  2.048748e-07
#> Y1:X[, -1]2      2.048748e-07 -2.086719e-05
#> Y2:(Intercept)   1.254834e-05  2.549036e-06
#> Y2:X[, -1]1      4.869273e-04 -4.941608e-06
#> Y2:X[, -1]2     -4.941608e-06  5.033195e-04

# Compare estimated covariances after drawing from the coefficient distribution
n <- 1000

f_matsamp <- function(M, U, V) {
  #' Wrapper around `matrixsampling::rmatrixnormal`
  as.vector(rmatrixnormal(n = 1, M, U, V))
}

d_matsamp <- replicate(n, draw_from_model(X, C, S, v, f_matsamp))
cov(t(d_matsamp))  # Similar sizes along the diagonal (first e-3, then e-4)
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.045558e-03  1.242998e-04 -6.396179e-05 -4.413909e-05  7.030552e-05
#> [2,]  1.242998e-04  2.134760e-03  5.721009e-05 -1.002274e-05 -9.579874e-06
#> [3,] -6.396179e-05  5.721009e-05  2.348186e-03 -6.256866e-06 -1.926543e-05
#> [4,] -4.413909e-05 -1.002274e-05 -6.256866e-06  4.941461e-04  1.924450e-05
#> [5,]  7.030552e-05 -9.579874e-06 -1.926543e-05  1.924450e-05  5.141899e-04
#> [6,] -1.345372e-05 -6.915185e-05 -1.033797e-04  4.295942e-05 -1.403097e-05
#>               [,6]
#> [1,] -1.345372e-05
#> [2,] -6.915185e-05
#> [3,] -1.033797e-04
#> [4,]  4.295942e-05
#> [5,] -1.403097e-05
#> [6,]  5.039882e-04


f_matnorm <- function (M, U, V) {
  #' Wrapper around `matrixNormal::rmatnorm`
  as.vector(rmatnorm(s = 1, M, U, V))
}

d_matnorm <- replicate(n, draw_from_model(X, C, S, v, f_matnorm))
cov(t(d_matnorm))  # Sizes alternate along the diagonal (e-3, e-4, e-3, ...)
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.210918e-03 -3.117320e-05  2.192985e-05 -2.377353e-05 -4.313487e-05
#> [2,] -3.117320e-05  4.972949e-04  1.826395e-05  3.037272e-06 -3.940224e-05
#> [3,]  2.192985e-05  1.826395e-05  2.070209e-03  3.201328e-05 -9.507253e-05
#> [4,] -2.377353e-05  3.037272e-06  3.201328e-05  4.894261e-04 -1.000038e-05
#> [5,] -4.313487e-05 -3.940224e-05 -9.507253e-05 -1.000038e-05  2.384154e-03
#> [6,] -1.579145e-05  1.837134e-05 -6.118079e-05 -9.955880e-06 -7.904825e-06
#>               [,6]
#> [1,] -1.579145e-05
#> [2,]  1.837134e-05
#> [3,] -6.118079e-05
#> [4,] -9.955880e-06
#> [5,] -7.904825e-06
#> [6,]  5.135833e-04

# Note that flipping U and V in the kronecker product gives the expected result
f_matnorm_flipped <- function (M, U, V) {
  Sigma <- kronecker(V, U) # <-- flipped
  as.vector(mvtnorm::rmvnorm(1, vec(M), Sigma, method = "chol"))
}

d_flipped <- replicate(n, draw_from_model(X, C, S, v, f_matnorm_flipped))
cov(t(d_flipped))
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.117118e-03 -3.051414e-05  3.065412e-06 -6.532210e-05 -3.199743e-05
#> [2,] -3.051414e-05  2.129165e-03 -2.290936e-05 -5.969029e-05  3.929485e-05
#> [3,]  3.065412e-06 -2.290936e-05  2.234542e-03 -2.753333e-05  3.361296e-05
#> [4,] -6.532210e-05 -5.969029e-05 -2.753333e-05  5.194137e-04  1.949865e-05
#> [5,] -3.199743e-05  3.929485e-05  3.361296e-05  1.949865e-05  5.168933e-04
#> [6,]  3.850787e-05  3.541797e-06  4.425072e-06  1.018581e-05  1.610612e-05
#>              [,6]
#> [1,] 3.850787e-05
#> [2,] 3.541797e-06
#> [3,] 4.425072e-06
#> [4,] 1.018581e-05
#> [5,] 1.610612e-05
#> [6,] 5.056230e-04

Created on 2021-08-24 by the reprex package (v2.0.0)

Edit: Updated example to include a random seed.

phargarten2 commented Nov 3, 2021

I am glad that you find the package useful in your work. Your use case is an example of the reason why I started the package. If you haven't seen it already, I think the vignette describes what you are trying to do: https://cran.r-project.org/web/packages/matrixNormal/vignettes/introduction-to-matrixnormal-package.html

Thank you for letting me know about the possible issue and for your investigation. I apologize for the delay, but I had a lot going on. Certainly, matrixsampling::rmatrixnormal() combines U and V the opposite way from matrixNormal. From matrixsampling::rmatrixnormal() (unimportant details elided):

function (n, M, U, V, checkSymmetry = TRUE, keep = TRUE) {
  # ...
  Uroot <- matrixroot(U, matrixname = "U", symmetric = !checkSymmetry)
  Vroot <- matrixroot(V, matrixname = "V", symmetric = !checkSymmetry)
  m <- nrow(Uroot)
  p <- nrow(Vroot)
  M <- as.matrix(M)
  # ...
  out[, , i] <- M + Uroot %*% Z[, , i] %*% Vroot
  # ...
}

So, in rmatrixnormal, the row covariance matrix U is m x m and the column covariance matrix V is p x p, so M is an m x p matrix. Their construction M + Uroot %*% Z %*% Vroot corresponds to taking the Kronecker product as (column, row). However, I can't seem to find a citation for their definitions of U and V.

In matrixNormal, the row covariance matrix U is n x n and the column covariance matrix V is p x p, so M is an n x p matrix. I double-checked the reference listed on the help file, and indeed, kronecker(U, V) (row, column) is correct.

Iranmanesh, Anis, M. Arashi, and S. M. M. Tabatabaey. "On Conditional Applications of Matrix Variate Normal Distribution." Iranian Journal of Mathematical Sciences and Informatics 5, no. 2 (November 1, 2010): 33-43. https://doi.org/10.7508/ijmsi.2010.02.004

phargarten2 commented Nov 3, 2021

Upon investigation, the matrixsampling package uses an "eigen" decomposition. So for a fair comparison to matrixsampling, you need to change the method to "eigen". The differences noted above still exist, however.

library(tidyverse)
library(matrixNormal)
library(matrixsampling)
library(MCMCpack)
set.seed(42)

# Simulate example data for multivariate linear regression
X <- cbind(1, matrix(rnorm(2000), ncol = 2))
beta <- matrix(c(5, -1, 1, 0, 3, 2), ncol = 2)

Y <- cbind(
  X %*% beta[, 1] + rnorm(1000, sd = sqrt(2)),
  X %*% beta[, 2] + rnorm(1000, sd = sqrt(0.5))
)

# Fit model and get MLE solution
fit <- lm(Y ~ X[, -1]) # NOTE: remove intercept here because it's added by lm
C <- coefficients(fit)
S <- estVar(fit)
v <- df.residual(fit)

draw_from_model <- function(X, C, S, v, rmnorm) {
  #' Draw from the posterior of the model parameters
  #'
  #' @param X Predictor matrix (n x k)
  #' @param C Matrix of estimated coefficients (k x m)
  #' @param S Matrix of estimated residual covariance (m x m)
  #' @param v Residual degrees of freedom (scalar)
  #'
  #' @note
  #' P(Sigma) ~ inverse-Wishart(v, S)
  #' P(Beta | Sigma) ~ Matrix-Normal(C, (X'X)^-1, Sigma)
  Sigma <- MCMCpack::riwish(v, (v - ncol(S) - 1) * S)
  Beta <- rmnorm(M = C, U = solve(t(X) %*% X), V = Sigma)
  Beta
}

# Compare estimated covariances after drawing from the coefficient distribution
n <- 1000

f_matsamp <- function(M, U, V) {
  #' Wrapper around `matrixsampling::rmatrixnormal`
  as.vector(matrixsampling::rmatrixnormal(n = 1, M, U, V))
}

d_matsamp <- replicate(n, draw_from_model(X, C, S, v, f_matsamp))
cov_matsamp <- cov(t(d_matsamp))


f_matnorm <- function(M, U, V) {
  #' Wrapper around `matrixNormal::rmatnorm`
  as.vector(matrixNormal::rmatnorm(s = 1, M, U, V, method = "eigen")) # method changed here
}

d_matnorm <- replicate(n, draw_from_model(X, C, S, v, f_matnorm))
cov_matnorm <- cov(t(d_matnorm))

f_matnorm_flipped <- function(M, U, V) {
  Sigma <- kronecker(V, U) # <-- flipped
  as.vector(mvtnorm::rmvnorm(1, vec(M), Sigma, method = "eigen")) # method changed here
}
d_flipped <- replicate(n, draw_from_model(X, C, S, v, f_matnorm_flipped))
cov_matnorm_flipped <- cov(t(d_flipped))

cov_matsamp
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.045558e-03  1.242998e-04 -6.396179e-05 -4.413909e-05  7.030552e-05
#> [2,]  1.242998e-04  2.134760e-03  5.721009e-05 -1.002274e-05 -9.579874e-06
#> [3,] -6.396179e-05  5.721009e-05  2.348186e-03 -6.256866e-06 -1.926543e-05
#> [4,] -4.413909e-05 -1.002274e-05 -6.256866e-06  4.941461e-04  1.924450e-05
#> [5,]  7.030552e-05 -9.579874e-06 -1.926543e-05  1.924450e-05  5.141899e-04
#> [6,] -1.345372e-05 -6.915185e-05 -1.033797e-04  4.295942e-05 -1.403097e-05
#>               [,6]
#> [1,] -1.345372e-05
#> [2,] -6.915185e-05
#> [3,] -1.033797e-04
#> [4,]  4.295942e-05
#> [5,] -1.403097e-05
#> [6,]  5.039882e-04
cov_matnorm
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.314735e-03 -4.868351e-05 -1.297305e-05  1.473154e-05 -7.012393e-05
#> [2,] -4.868351e-05  5.080696e-04 -1.632279e-05  4.435097e-06 -1.174082e-05
#> [3,] -1.297305e-05 -1.632279e-05  2.072238e-03 -8.130386e-05  1.886948e-05
#> [4,]  1.473154e-05  4.435097e-06 -8.130386e-05  4.991333e-04  3.156826e-05
#> [5,] -7.012393e-05 -1.174082e-05  1.886948e-05  3.156826e-05  2.228718e-03
#> [6,] -1.089046e-05 -7.687652e-06  5.080602e-05 -1.368645e-05 -3.867333e-05
#>               [,6]
#> [1,] -1.089046e-05
#> [2,] -7.687652e-06
#> [3,]  5.080602e-05
#> [4,] -1.368645e-05
#> [5,] -3.867333e-05
#> [6,]  5.092445e-04
cov_matnorm_flipped
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.170470e-03 -2.527128e-05  5.424053e-05  2.596522e-05 -4.494087e-05
#> [2,] -2.527128e-05  4.900120e-04 -4.077981e-05  3.050041e-05 -4.268714e-05
#> [3,]  5.424053e-05 -4.077981e-05  2.134899e-03 -1.962760e-05 -1.483435e-04
#> [4,]  2.596522e-05  3.050041e-05 -1.962760e-05  4.899137e-04  1.777036e-05
#> [5,] -4.494087e-05 -4.268714e-05 -1.483435e-04  1.777036e-05  2.298639e-03
#> [6,]  3.335734e-05 -1.059047e-05  6.319208e-05  1.462154e-05 -1.359489e-05
#>               [,6]
#> [1,]  3.335734e-05
#> [2,] -1.059047e-05
#> [3,]  6.319208e-05
#> [4,]  1.462154e-05
#> [5,] -1.359489e-05
#> [6,]  5.341030e-04

Created on 2021-11-03 by the reprex package (v2.0.1)

@phargarten2

BTW, thanks to your comments, @prockenschaub, I see that the help file for the argument `method` in rmatnorm is confusing (it lists two "defaults") and needs to be updated.

method	
String specifying the matrix decomposition used to determine the matrix root of the Kronecker product of U and V in rmatnorm. Possible methods are eigenvalue decomposition **("eigen", default),** singular value decomposition ("svd"), and Cholesky decomposition ("chol"). The Cholesky (the default) is typically fastest, but not by much though. Passed to rmvnorm.

As you can see in the function, "eigen" is not the default method and needs to be changed to "chol".

@prockenschaub

Thank you for taking the time to look into this! It's been a little while since I last worked with this, so hopefully I won't misremember :)

Having read through your explanations, I understand that U is an n-by-n matrix, where n is the number of rows, and V is a p-by-p matrix, where p is the number of columns. This makes sense and aligns e.g. with how the matrix normal distribution is described on Wikipedia. The disagreement then stems from whether vec(X) ~ Multivariate-Normal(vec(M), kronecker(U, V)) or vec(X) ~ Multivariate-Normal(vec(M), kronecker(V, U)), and it is surprisingly difficult to find a reliable reference for one or the other.
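For what it's worth, the kronecker(V, U) convention falls out of the standard identity vec(AZB) = (Bᵀ ⊗ A) vec(Z): writing X = M + U^(1/2) Z V^(1/2) with symmetric square roots gives cov(vec(X)) = (V^(1/2) ⊗ U^(1/2))(V^(1/2) ⊗ U^(1/2))ᵀ = V ⊗ U. A quick numerical check of that identity, as a sketch in Python/numpy purely so it is self-contained (column-stacking vec is `ravel(order="F")` there; the matrices A, B, Z are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2
A = rng.standard_normal((n, n))  # plays the role of U^(1/2)
B = rng.standard_normal((p, p))  # plays the role of V^(1/2)
Z = rng.standard_normal((n, p))  # arbitrary n x p matrix

# Column-stacking vec, as in R: vec(X) == X.ravel(order="F")
lhs = (A @ Z @ B).ravel(order="F")
rhs = np.kron(B.T, A) @ Z.ravel(order="F")
print(np.allclose(lhs, rhs))  # True: vec(AZB) == (B' kron A) vec(Z)
```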

Before writing my earlier comment in August, I read your reference of Iranmanesh et al. (2010) and I agree with your conclusion that their definition suggests kronecker(U, V).

However, Wikipedia (citing work by Gupta and Nagar, 1999) suggests the opposite, kronecker(V, U). Unfortunately, I do not have access to the original work by Gupta and Nagar, but other more recent work such as Pocuca et al. (2019) also uses this definition (again citing Gupta and Nagar).

In light of the simulation results in my earlier post and the rather obscure journal in which Iranmanesh et al. (2010) was published, I tend to assume that the definition in Iranmanesh et al. is incorrect and that kronecker(V, U) is the correct one.

This also makes sense when thinking about the resulting covariance matrix. Using kronecker(V, U), the covariance of the first n entries of vec(X) (i.e., the first column p = 1 of X) appears in the upper-left n-by-n block of the covariance matrix as v_11 * U, where v_11 is the entry of V in row 1, column 1 and represents the variance of that column (in the multivariate linear regression of my earlier example, the residual variance of the first outcome variable). When kronecker(U, V) is used instead, no sensible covariance matrix arises, since the upper-left block would be u_11 * V, which cannot be reasonably linked to the first column of vec(X).
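That block structure is easy to verify directly. A small sketch in Python/numpy (used here only to keep the check self-contained; the Toeplitz-style U and V are hypothetical row and column covariances):

```python
import numpy as np

# Hypothetical covariances: U (n x n, rows), V (p x p, columns).
U = np.array([[3., 2., 1.],
              [2., 3., 2.],
              [1., 2., 3.]])
V = np.array([[2., 1.],
              [1., 2.]])
n, p = U.shape[0], V.shape[0]

# (column, row) ordering: upper-left n x n block is v_11 * U,
# i.e., the covariance of the first column of X scaled by its variance.
Sigma = np.kron(V, U)
print(np.allclose(Sigma[:n, :n], V[0, 0] * U))  # True

# With (row, column) ordering the upper-left block is u_11 * V instead,
# which mixes the column variances into a block that should describe
# a single column of X.
Sigma_alt = np.kron(U, V)
print(np.allclose(Sigma_alt[:p, :p], U[0, 0] * V))  # True
```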

@phargarten2

Thank you, @prockenschaub. Your recent note clarifies many items for me. Thank you for bringing the literature to my attention; I was unaware of Pocuca et al. (2019). Using an example from @stla, I think I see what you mean.

m <- 3  #subjects
p <- 2  #covariates 
M <- matrix(0, m, p)
U <- toeplitz(m:1); colnames(U) <- rownames(U) <- as.character(1:3)  #row covariance matrix
V <- toeplitz(p:1); colnames(V) <- rownames(V) <- c("beta0", "beta1") #column covariance matrix 

U
#>   1 2 3
#> 1 3 2 1
#> 2 2 3 2
#> 3 1 2 3

V
#>       beta0 beta1
#> beta0     2     1
#> beta1     1     2

#Covariance matrix of X, current 
kronecker(U, V, make.dimnames = TRUE)
#>         1:beta0 1:beta1 2:beta0 2:beta1 3:beta0 3:beta1
#> 1:beta0       6       3       4       2       2       1
#> 1:beta1       3       6       2       4       1       2
#> 2:beta0       4       2       6       3       4       2
#> 2:beta1       2       4       3       6       2       4
#> 3:beta0       2       1       4       2       6       3
#> 3:beta1       1       2       2       4       3       6

#Covariance matrix of X, makes sense, proposed
kronecker(V, U, make.dimnames = TRUE)
#>         beta0:1 beta0:2 beta0:3 beta1:1 beta1:2 beta1:3
#> beta0:1       6       4       2       3       2       1
#> beta0:2       4       6       4       2       3       2
#> beta0:3       2       4       6       1       2       3
#> beta1:1       3       2       1       6       4       2
#> beta1:2       2       3       2       4       6       4
#> beta1:3       1       2       3       2       4       6

Created on 2021-11-03 by the reprex package (v2.0.1)

The first makes no sense, while the second does, especially since vec() stacks the first column, then the second, etc.
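The column-stacking order of vec() is what makes the kronecker(V, U) labels (beta0:1, beta0:2, beta0:3, beta1:1, ...) line up. A minimal sketch in Python/numpy (used only for a self-contained check; the 3-by-2 matrix is hypothetical, rows = subjects, columns = beta0/beta1):

```python
import numpy as np

# Hypothetical 3 x 2 matrix: rows = subjects 1..3, columns = beta0, beta1.
X = np.array([[11, 21],
              [12, 22],
              [13, 23]])

# Column-stacking vec (R's matrixNormal::vec; Fortran order in numpy):
v = X.ravel(order="F")
print(v.tolist())  # [11, 12, 13, 21, 22, 23]: beta0 column first, then beta1
```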

phargarten2 added a commit that referenced this issue Dec 30, 2021
# matrixNormal 0.1.0 2021-Dec-29

  - `pmatnorm`: Included argument `keepAttr` to pass to
    `mvtnorm::pmvnorm()` v.1.1-2: a logical allowing users to attach
    error and message attributes to the return value.

  - `rmatnorm`:
    
      - Added argument `checkSymmetry = FALSE` to `mvtnorm::rmvnorm()`
        as the matrices are already checked for symmetry.
      - U and V were incorrectly switched (thanks, @prockenschaub,
        #1).
          - Added warning to earlier versions of R
          - Changed reference cited.  
          - Made argument method clearer

  - Documentation clarification, thanks #1

# matrixNormal 0.0.5 2021-Apr-1

## Bug Fixes

  - Minor Change: Changed lazy data to FALSE in description file to
    remove notes on CRAN check in R 4.0.5.

# matrixNormal 0.0.4 2020-08-26

  - New `vech()` function: performs half-vectorization on a symmetric
    matrix. This is the vector-form of the lower triangular matrix only.
    Unlike other functions on CRAN, vech() inherits any names from the
    matrix.

## Bug Fixes

  - `dmatrixnorm`: Clarified by replacing the name of argument `use.log`
    with `log` for consistency in argument name with mvtnorm and stats
    package.
  - `is.symmetric()`, `is.positive.definite()`,
    `is.positive.semi.definite()`:
      - if A is not symmetric, these functions NOW return FALSE instead
        of stopping the function. Restructured helper `find.eval()`.  
      - if A contains a missing value (`NA`), these functions NOW return
        NA.
  - `rmatnorm`: Added the first argument `s` to draw many random
    samples. Only 1 sample is still drawn; the argument currently has no
    effect but acts as a placeholder in future releases.
  - Clarified documentation.
  - Added session information and version details to the vignette. The
    updated versions of the packages listed do not affect the results of
    matrixNormal functions.
@phargarten2

Updated in version 0.1.0 (#1)
