
U and V #1

Closed
prockenschaub opened this issue Aug 24, 2021 · 6 comments
prockenschaub commented Aug 24, 2021

First off, thank you very much for your package and the accompanying vignette. It helped me a lot in getting to grips with the matrix normal distribution!

I have been using your package to draw samples from the posterior of a multivariate linear regression, which follows an inverse-Wishart times a matrix normal distribution (see for example Wikipedia). While doing so, I stumbled across an inconsistency in the variance of those draws, and I think it is because U and V in the code below should be switched.

Sigma <- kronecker(U,V)

Example

Sorry for the slightly verbose example, but here is some code that implements what I meant to do. It compares the results obtained via matrixNormal (which differ from what I expected) to those that I get using the matrixsampling package (which are in line with expectations).

library(tidyverse)
library(matrixNormal)
library(matrixsampling)
library(MCMCpack)

set.seed(42)

# Simulate example data for multivariate linear regression
X <- cbind(1, matrix(rnorm(2000), ncol = 2))
beta <- matrix(c(5, -1, 1, 0, 3, 2), ncol = 2)

Y <- cbind(
  X %*% beta[, 1] + rnorm(1000, sd = sqrt(2)),
  X %*% beta[, 2] + rnorm(1000, sd = sqrt(0.5))
)

# Fit model and get MLE solution
fit <- lm(Y ~ X[, -1]) # NOTE: remove intercept here because it's added by lm
C <- coefficients(fit)
S <- estVar(fit)
v <- df.residual(fit)

draw_from_model <- function(X, C, S, v, rmnorm) {
  #' Draw from the posterior of the model parameters
  #' 
  #' @param X Predictor matrix (n x k)
  #' @param C Matrix of estimated coefficients (k x m)
  #' @param S Matrix of estimated residual covariance (m x m)
  #' @param v Residual degrees of freedom (scalar)
  #' 
  #' @note
  #' P(Sigma) ~ inverse-Wishart(v, S)
  #' P(Beta | Sigma) ~ Matrix-Normal(C, (X'X)^-1, Sigma)
  Sigma <- MCMCpack::riwish(v, (v - ncol(S) - 1) * S)   
  Beta <- rmnorm(M = C, U = solve(t(X) %*% X), V = Sigma)
  Beta
}

# Estimated covariance of the coefficients (as obtained from the model)
vcov(fit)
#>                Y1:(Intercept)   Y1:X[, -1]1   Y1:X[, -1]2 Y2:(Intercept)
#> Y1:(Intercept)   2.124127e-03  5.448705e-05  1.106835e-05  -2.028120e-05
#> Y1:X[, -1]1      5.448705e-05  2.114322e-03 -2.145732e-05  -5.202434e-07
#> Y1:X[, -1]2      1.106835e-05 -2.145732e-05  2.185500e-03  -1.056808e-07
#> Y2:(Intercept)  -2.028120e-05 -5.202434e-07 -1.056808e-07   4.891852e-04
#> Y2:X[, -1]1     -5.202434e-07 -2.018759e-05  2.048748e-07   1.254834e-05
#> Y2:X[, -1]2     -1.056808e-07  2.048748e-07 -2.086719e-05   2.549036e-06
#>                   Y2:X[, -1]1   Y2:X[, -1]2
#> Y1:(Intercept)  -5.202434e-07 -1.056808e-07
#> Y1:X[, -1]1     -2.018759e-05  2.048748e-07
#> Y1:X[, -1]2      2.048748e-07 -2.086719e-05
#> Y2:(Intercept)   1.254834e-05  2.549036e-06
#> Y2:X[, -1]1      4.869273e-04 -4.941608e-06
#> Y2:X[, -1]2     -4.941608e-06  5.033195e-04

# Compare estimated covariances after drawing from the coefficient distribution
n <- 1000

f_matsamp <- function(M, U, V) {
  #' Wrapper around `matrixsampling::rmatrixnormal`
  as.vector(rmatrixnormal(n = 1, M, U, V))
}

d_matsamp <- replicate(n, draw_from_model(X, C, S, v, f_matsamp))
cov(t(d_matsamp))  # Similar sizes along the diagonal (first e-3, then e-4)
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.045558e-03  1.242998e-04 -6.396179e-05 -4.413909e-05  7.030552e-05
#> [2,]  1.242998e-04  2.134760e-03  5.721009e-05 -1.002274e-05 -9.579874e-06
#> [3,] -6.396179e-05  5.721009e-05  2.348186e-03 -6.256866e-06 -1.926543e-05
#> [4,] -4.413909e-05 -1.002274e-05 -6.256866e-06  4.941461e-04  1.924450e-05
#> [5,]  7.030552e-05 -9.579874e-06 -1.926543e-05  1.924450e-05  5.141899e-04
#> [6,] -1.345372e-05 -6.915185e-05 -1.033797e-04  4.295942e-05 -1.403097e-05
#>               [,6]
#> [1,] -1.345372e-05
#> [2,] -6.915185e-05
#> [3,] -1.033797e-04
#> [4,]  4.295942e-05
#> [5,] -1.403097e-05
#> [6,]  5.039882e-04


f_matnorm <- function (M, U, V) {
  #' Wrapper around `matrixNormal::rmatnorm`
  as.vector(rmatnorm(s = 1, M, U, V))
}

d_matnorm <- replicate(n, draw_from_model(X, C, S, v, f_matnorm))
cov(t(d_matnorm))  # Sizes alternate along the diagonal (e-3, e-4, e-3, ...)
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.210918e-03 -3.117320e-05  2.192985e-05 -2.377353e-05 -4.313487e-05
#> [2,] -3.117320e-05  4.972949e-04  1.826395e-05  3.037272e-06 -3.940224e-05
#> [3,]  2.192985e-05  1.826395e-05  2.070209e-03  3.201328e-05 -9.507253e-05
#> [4,] -2.377353e-05  3.037272e-06  3.201328e-05  4.894261e-04 -1.000038e-05
#> [5,] -4.313487e-05 -3.940224e-05 -9.507253e-05 -1.000038e-05  2.384154e-03
#> [6,] -1.579145e-05  1.837134e-05 -6.118079e-05 -9.955880e-06 -7.904825e-06
#>               [,6]
#> [1,] -1.579145e-05
#> [2,]  1.837134e-05
#> [3,] -6.118079e-05
#> [4,] -9.955880e-06
#> [5,] -7.904825e-06
#> [6,]  5.135833e-04

# Note that flipping U and V in the kronecker product gives the expected result
f_matnorm_flipped <- function (M, U, V) {
  Sigma <- kronecker(V, U) # <-- flipped
  as.vector(mvtnorm::rmvnorm(1, vec(M), Sigma, method = "chol"))
}

d_flipped <- replicate(n, draw_from_model(X, C, S, v, f_matnorm_flipped))
cov(t(d_flipped))
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.117118e-03 -3.051414e-05  3.065412e-06 -6.532210e-05 -3.199743e-05
#> [2,] -3.051414e-05  2.129165e-03 -2.290936e-05 -5.969029e-05  3.929485e-05
#> [3,]  3.065412e-06 -2.290936e-05  2.234542e-03 -2.753333e-05  3.361296e-05
#> [4,] -6.532210e-05 -5.969029e-05 -2.753333e-05  5.194137e-04  1.949865e-05
#> [5,] -3.199743e-05  3.929485e-05  3.361296e-05  1.949865e-05  5.168933e-04
#> [6,]  3.850787e-05  3.541797e-06  4.425072e-06  1.018581e-05  1.610612e-05
#>              [,6]
#> [1,] 3.850787e-05
#> [2,] 3.541797e-06
#> [3,] 4.425072e-06
#> [4,] 1.018581e-05
#> [5,] 1.610612e-05
#> [6,] 5.056230e-04

Created on 2021-08-24 by the reprex package (v2.0.0)

Edit: Updated example to include a random seed.

phargarten2 commented Nov 3, 2021

I am glad that you find the package useful in your work. Your use case is an example of the reason why I started the package. If you haven't seen it already, I think the vignette describes what you are trying to do: https://cran.r-project.org/web/packages/matrixNormal/vignettes/introduction-to-matrixnormal-package.html

Thank you for letting me know about the possible issue and for your investigation. I apologize for the delay, but I had a lot going on. Certainly, matrixsampling::rmatrixnormal() combines U and V the opposite way from matrixNormal. From matrixsampling::rmatrixnormal() (unimportant details elided):

function (n, M, U, V, checkSymmetry = TRUE, keep = TRUE) {
  # ...
  Uroot <- matrixroot(U, matrixname = "U", symmetric = !checkSymmetry)
  Vroot <- matrixroot(V, matrixname = "V", symmetric = !checkSymmetry)
  m <- nrow(Uroot)
  p <- nrow(Vroot)
  M <- as.matrix(M)
  # ...
  out[, , i] <- M + Uroot %*% Z[, , i] %*% Vroot
  # ...
}

So, in rmatrixnormal, the row covariance matrix U is m x m and the column covariance matrix V is p x p, so M is an m x p matrix. Their construction M + Uroot %*% Z %*% Vroot corresponds to taking the Kronecker product as (column, row). However, I can't seem to find a citation for their definitions of U and V.

In matrixNormal, the row covariance matrix U is n x n and the column covariance matrix V is p x p, so M is an n x p matrix. I double-checked the reference listed on the help file, and indeed, kronecker(U, V) (row, column) is correct.

Iranmanesh, Anis, M. Arashi, and S. M. M. Tabatabaey. "On Conditional Applications of Matrix Variate Normal Distribution." Iranian Journal of Mathematical Sciences and Informatics 5, no. 2 (November 1, 2010): 33-43. https://doi.org/10.7508/ijmsi.2010.02.004

phargarten2 commented Nov 3, 2021

Upon investigation, the matrixsampling package uses an "eigen" decomposition. So for a fair comparison to matrixsampling, you need to change the method to "eigen". The differences noted above still exist, however.

library(tidyverse)
library(matrixNormal)
library(matrixsampling)
library(MCMCpack)
set.seed(42)

# Simulate example data for multivariate linear regression
X <- cbind(1, matrix(rnorm(2000), ncol = 2))
beta <- matrix(c(5, -1, 1, 0, 3, 2), ncol = 2)

Y <- cbind(
  X %*% beta[, 1] + rnorm(1000, sd = sqrt(2)),
  X %*% beta[, 2] + rnorm(1000, sd = sqrt(0.5))
)

# Fit model and get MLE solution
fit <- lm(Y ~ X[, -1]) # NOTE: remove intercept here because it's added by lm
C <- coefficients(fit)
S <- estVar(fit)
v <- df.residual(fit)

draw_from_model <- function(X, C, S, v, rmnorm) {
  #' Draw from the posterior of the model parameters
  #'
  #' @param X Predictor matrix (n x k)
  #' @param C Matrix of estimated coefficients (k x m)
  #' @param S Matrix of estimated residual covariance (m x m)
  #' @param v Residual degrees of freedom (scalar)
  #'
  #' @note
  #' P(Sigma) ~ inverse-Wishart(v, S)
  #' P(Beta | Sigma) ~ Matrix-Normal(C, (X'X)^-1, Sigma)
  Sigma <- MCMCpack::riwish(v, (v - ncol(S) - 1) * S)
  Beta <- rmnorm(M = C, U = solve(t(X) %*% X), V = Sigma)
  Beta
}

# Compare estimated covariances after drawing from the coefficient distribution
n <- 1000

f_matsamp <- function(M, U, V) {
  #' Wrapper around `matrixsampling::rmatrixnormal`
  as.vector(matrixsampling::rmatrixnormal(n = 1, M, U, V))
}

d_matsamp <- replicate(n, draw_from_model(X, C, S, v, f_matsamp))
cov_matsamp <- cov(t(d_matsamp))


f_matnorm <- function(M, U, V) {
  #' Wrapper around `matrixNormal::rmatnorm`
  as.vector(matrixNormal::rmatnorm(s = 1, M, U, V, method = "eigen")) # method changed here
}

d_matnorm <- replicate(n, draw_from_model(X, C, S, v, f_matnorm))
cov_matnorm <- cov(t(d_matnorm))

f_matnorm_flipped <- function(M, U, V) {
  Sigma <- kronecker(V, U) # <-- flipped
  as.vector(mvtnorm::rmvnorm(1, vec(M), Sigma, method = "eigen")) # method changed here
}
d_flipped <- replicate(n, draw_from_model(X, C, S, v, f_matnorm_flipped))
cov_matnorm_flipped <- cov(t(d_flipped))

cov_matsamp
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.045558e-03  1.242998e-04 -6.396179e-05 -4.413909e-05  7.030552e-05
#> [2,]  1.242998e-04  2.134760e-03  5.721009e-05 -1.002274e-05 -9.579874e-06
#> [3,] -6.396179e-05  5.721009e-05  2.348186e-03 -6.256866e-06 -1.926543e-05
#> [4,] -4.413909e-05 -1.002274e-05 -6.256866e-06  4.941461e-04  1.924450e-05
#> [5,]  7.030552e-05 -9.579874e-06 -1.926543e-05  1.924450e-05  5.141899e-04
#> [6,] -1.345372e-05 -6.915185e-05 -1.033797e-04  4.295942e-05 -1.403097e-05
#>               [,6]
#> [1,] -1.345372e-05
#> [2,] -6.915185e-05
#> [3,] -1.033797e-04
#> [4,]  4.295942e-05
#> [5,] -1.403097e-05
#> [6,]  5.039882e-04
cov_matnorm
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.314735e-03 -4.868351e-05 -1.297305e-05  1.473154e-05 -7.012393e-05
#> [2,] -4.868351e-05  5.080696e-04 -1.632279e-05  4.435097e-06 -1.174082e-05
#> [3,] -1.297305e-05 -1.632279e-05  2.072238e-03 -8.130386e-05  1.886948e-05
#> [4,]  1.473154e-05  4.435097e-06 -8.130386e-05  4.991333e-04  3.156826e-05
#> [5,] -7.012393e-05 -1.174082e-05  1.886948e-05  3.156826e-05  2.228718e-03
#> [6,] -1.089046e-05 -7.687652e-06  5.080602e-05 -1.368645e-05 -3.867333e-05
#>               [,6]
#> [1,] -1.089046e-05
#> [2,] -7.687652e-06
#> [3,]  5.080602e-05
#> [4,] -1.368645e-05
#> [5,] -3.867333e-05
#> [6,]  5.092445e-04
cov_matnorm_flipped
#>               [,1]          [,2]          [,3]          [,4]          [,5]
#> [1,]  2.170470e-03 -2.527128e-05  5.424053e-05  2.596522e-05 -4.494087e-05
#> [2,] -2.527128e-05  4.900120e-04 -4.077981e-05  3.050041e-05 -4.268714e-05
#> [3,]  5.424053e-05 -4.077981e-05  2.134899e-03 -1.962760e-05 -1.483435e-04
#> [4,]  2.596522e-05  3.050041e-05 -1.962760e-05  4.899137e-04  1.777036e-05
#> [5,] -4.494087e-05 -4.268714e-05 -1.483435e-04  1.777036e-05  2.298639e-03
#> [6,]  3.335734e-05 -1.059047e-05  6.319208e-05  1.462154e-05 -1.359489e-05
#>               [,6]
#> [1,]  3.335734e-05
#> [2,] -1.059047e-05
#> [3,]  6.319208e-05
#> [4,]  1.462154e-05
#> [5,] -1.359489e-05
#> [6,]  5.341030e-04

Created on 2021-11-03 by the reprex package (v2.0.1)

@phargarten2

BTW, thanks to your comments, @prockenschaub, I see that the help file for the argument `method` in rmatnorm is confusing (it lists two "defaults") and needs to be updated.

method	
String specifying the matrix decomposition used to determine the matrix root of the Kronecker product of U and V in rmatnorm. Possible methods are eigenvalue decomposition **("eigen", default),** singular value decomposition ("svd"), and Cholesky decomposition ("chol"). The Cholesky (the default) is typically fastest, but not by much though. Passed to rmvnorm.

As you can see in the function, "eigen" is not the default method and needs to be changed to "chol".

@prockenschaub

Thank you for taking the time to look into this! It's been a little while since I last worked with this, so hopefully I won't misremember :)

Having read through your explanations, I understand that U is an n-by-n matrix, where n is the number of rows, and V is a p-by-p matrix, where p is the number of columns. This makes sense and aligns e.g. with how the matrix normal distribution is described on Wikipedia. The disagreement then stems from whether vec(X) ~ Multivariate-Normal(vec(M), kronecker(U, V)) or vec(X) ~ Multivariate-Normal(vec(M), kronecker(V, U)), and it is surprisingly difficult to find a reliable reference for one or the other.
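For what it's worth, the kronecker(V, U) convention falls out of the standard identity vec(AZB) = (Bᵀ ⊗ A) vec(Z): writing X = M + U^(1/2) Z V^(1/2) with symmetric square roots gives cov(vec(X)) = (V^(1/2) ⊗ U^(1/2))(V^(1/2) ⊗ U^(1/2))ᵀ = V ⊗ U. A quick numerical check of that identity, as a sketch in Python/numpy purely so it is self-contained (column-stacking vec is `ravel(order="F")` there; the matrices A, B, Z are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2
A = rng.standard_normal((n, n))  # plays the role of U^(1/2)
B = rng.standard_normal((p, p))  # plays the role of V^(1/2)
Z = rng.standard_normal((n, p))  # arbitrary n x p matrix

# Column-stacking vec, as in R: vec(X) == X.ravel(order="F")
lhs = (A @ Z @ B).ravel(order="F")
rhs = np.kron(B.T, A) @ Z.ravel(order="F")
print(np.allclose(lhs, rhs))  # True: vec(AZB) == (B' kron A) vec(Z)
```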

Before writing my earlier comment in August, I read your reference of Iranmanesh et al. (2010) and I agree with your conclusion that their definition suggests kronecker(U, V).

However, Wikipedia (citing work by Gupta and Nagar, 1999) suggests the opposite, kronecker(V, U). Unfortunately, I do not have access to the original work by Gupta and Nagar, but other more recent work such as Pocuca et al. (2019) also uses this definition (again citing Gupta and Nagar).

In light of the simulation results in my earlier post and the rather obscure journal in which Iranmanesh et al. (2010) was published, I tend to assume that the definition in Iranmanesh et al. is incorrect and that kronecker(V, U) is the correct one.

This also makes sense when thinking about the resulting covariance matrix. Using kronecker(V, U), the covariance of the first n entries of vec(X) (i.e., the first column p = 1 of X) appears in the upper-left n-by-n block of the covariance matrix as v_11 * U, where v_11 is the entry of V in row 1, column 1 and represents the variance of that column (in the multivariate linear regression of my earlier example, the residual variance of the first outcome variable). When kronecker(U, V) is used instead, no sensible covariance matrix arises, since the upper-left block would be u_11 * V, which cannot be reasonably linked to the first column of vec(X).
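That block structure is easy to verify directly. A small sketch in Python/numpy (used here only to keep the check self-contained; the Toeplitz-style U and V are hypothetical row and column covariances):

```python
import numpy as np

# Hypothetical covariances: U (n x n, rows), V (p x p, columns).
U = np.array([[3., 2., 1.],
              [2., 3., 2.],
              [1., 2., 3.]])
V = np.array([[2., 1.],
              [1., 2.]])
n, p = U.shape[0], V.shape[0]

# (column, row) ordering: upper-left n x n block is v_11 * U,
# i.e., the covariance of the first column of X scaled by its variance.
Sigma = np.kron(V, U)
print(np.allclose(Sigma[:n, :n], V[0, 0] * U))  # True

# With (row, column) ordering the upper-left block is u_11 * V instead,
# which mixes the column variances into a block that should describe
# a single column of X.
Sigma_alt = np.kron(U, V)
print(np.allclose(Sigma_alt[:p, :p], U[0, 0] * V))  # True
```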

@phargarten2

Thank you, @prockenschaub. Your recent note clarifies many items for me. Thank you for bringing the literature to my attention; I was unaware of Pocuca et al. (2019). Using an example from @stla, I think I see what you mean.

m <- 3  #subjects
p <- 2  #covariates 
M <- matrix(0, m, p)
U <- toeplitz(m:1); colnames(U) <- rownames(U) <- as.character(1:3)  #row covariance matrix
V <- toeplitz(p:1); colnames(V) <- rownames(V) <- c("beta0", "beta1") #column covariance matrix 

U
#>   1 2 3
#> 1 3 2 1
#> 2 2 3 2
#> 3 1 2 3

V
#>       beta0 beta1
#> beta0     2     1
#> beta1     1     2

#Covariance matrix of X, current 
kronecker(U, V, make.dimnames = TRUE)
#>         1:beta0 1:beta1 2:beta0 2:beta1 3:beta0 3:beta1
#> 1:beta0       6       3       4       2       2       1
#> 1:beta1       3       6       2       4       1       2
#> 2:beta0       4       2       6       3       4       2
#> 2:beta1       2       4       3       6       2       4
#> 3:beta0       2       1       4       2       6       3
#> 3:beta1       1       2       2       4       3       6

#Covariance matrix of X, makes sense, proposed
kronecker(V, U, make.dimnames = TRUE)
#>         beta0:1 beta0:2 beta0:3 beta1:1 beta1:2 beta1:3
#> beta0:1       6       4       2       3       2       1
#> beta0:2       4       6       4       2       3       2
#> beta0:3       2       4       6       1       2       3
#> beta1:1       3       2       1       6       4       2
#> beta1:2       2       3       2       4       6       4
#> beta1:3       1       2       3       2       4       6

Created on 2021-11-03 by the reprex package (v2.0.1)

The first makes no sense, while the second does, especially since vec() stacks the first column, then the second, etc.
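The column-stacking order of vec() is what makes the kronecker(V, U) labels (beta0:1, beta0:2, beta0:3, beta1:1, ...) line up. A minimal sketch in Python/numpy (used only for a self-contained check; the 3-by-2 matrix is hypothetical, rows = subjects, columns = beta0/beta1):

```python
import numpy as np

# Hypothetical 3 x 2 matrix: rows = subjects 1..3, columns = beta0, beta1.
X = np.array([[11, 21],
              [12, 22],
              [13, 23]])

# Column-stacking vec (R's matrixNormal::vec; Fortran order in numpy):
v = X.ravel(order="F")
print(v.tolist())  # [11, 12, 13, 21, 22, 23]: beta0 column first, then beta1
```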

phargarten2 added a commit that referenced this issue Dec 30, 2021
# matrixNormal 0.1.0 2021-Dec-29

  - `pmatnorm`: Included argument `keepAttr` to pass to
    `mvtnorm::pmvnorm()` v.1.1-2: a logical allowing users to attach
    error and message attributes to the return value.

  - `rmatnorm`:
    
      - Added argument `checkSymmetry = FALSE` to `mvtnorm::rmvnorm()`
        as the matrices are already checked for symmetry.
      - U and V were incorrectly switched (thanks, @prockenschaub,
        #1).
          - Added warning to earlier versions of R
          - Changed reference cited.  
          - Made argument method clearer

  - Documentation clarification, thanks #1

# matrixNormal 0.0.5 2021-Apr-1

## Bug Fixes

  - Minor Change: Changed lazy data to FALSE in description file to
    remove notes on CRAN check in R 4.0.5.

# matrixNormal 0.0.4 2020-08-26

  - New `vech()` function: performs half-vectorization on a symmetric
    matrix. This is the vector-form of the lower triangular matrix only.
    Unlike other functions on CRAN, vech() inherits any names from the
    matrix.

## Bug Fixes

  - `dmatrixnorm`: Clarified by replacing the name of argument `use.log`
    with `log` for consistency in argument name with mvtnorm and stats
    package.
  - `is.symmetric()`, `is.positive.definite()`,
    `is.positive.semi.definite()`:
      - if A is not symmetric, these functions NOW return FALSE instead
        of stopping the function. Restructured helper `find.eval()`.  
      - if A contains a missing value (`NA`), these functions NOW return
        NA.
  - `rmatnorm`: Added the first argument `s` to draw many random
    samples. Only 1 sample is still drawn; the argument currently has no
    effect but acts as a placeholder in future releases.
  - Clarified documentation.
  - Added session information and version details to the vignette. The
    updated versions of the packages listed do not affect the results of
    matrixNormal functions.
@phargarten2

Updated in version 0.1.0 (#1)
