diff --git a/DESCRIPTION b/DESCRIPTION index cea02a2..bc1bf02 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: pivmet Type: Package Title: Pivotal Methods for Bayesian Relabelling and k-Means Clustering -Version: 0.2.0 -Date: 2019-06-23 +Version: 0.3.0 +Date: 2020-06-01 Author: Leonardo Egidi[aut, cre], Roberta Pappadà[aut], Francesco Pauli[aut], @@ -21,13 +21,14 @@ Encoding: UTF-8 LazyData: true LazyLoad: yes SystemRequirements: pandoc (>= 1.12.3), pandoc-citeproc -Depends: bayesmix, rjags, mvtnorm, RcmdrMisc -Imports: cluster, mclust, MASS, corpcor, runjags, rstan +Depends: R (>= 3.1.0) +Imports: cluster, mclust, MASS, corpcor, runjags, rstan, bayesmix, + rjags, mvtnorm, RcmdrMisc, bayesplot Suggests: knitr, rmarkdown VignetteBuilder: knitr -RoxygenNote: 6.1.0 +RoxygenNote: 7.1.0 BuildManual: yes NeedsCompilation: no -Packaged: 2019-06-24 16:34:12 UTC; leoeg +Packaged: 2020-06-03 15:34:32 UTC; leoeg Repository: CRAN -Date/Publication: 2019-06-24 16:50:03 UTC +Date/Publication: 2020-06-03 16:00:03 UTC diff --git a/MD5 b/MD5 index 33d66a7..797d13a 100644 --- a/MD5 +++ b/MD5 @@ -1,36 +1,36 @@ -b1bb55055b5109870d8a8dbc9831f3a3 *DESCRIPTION -5d9f03ce2737ba18b95fdc5bb9af8cf4 *NAMESPACE -1408e32bed72983ffb72525b7886e30e *NEWS.md -289515c40ba1d4f146c914dca542a818 *R/bayes_mcmc.R +2d67a32d66963f93eedc11925b50240c *DESCRIPTION +3aa35170c13d0fdb7128cd4e332fe200 *NAMESPACE +466cf791d0f7e2263cebce2e7a53f3eb *NEWS.md +1583004459342bada788b1ae6af719f4 *R/bayes_mcmc.R 4d44c58648506db591eb555dd52f0d0e *R/mus.R -68c6d00c9e15c8f79d91246288890b3f *R/musk.R +3feee1c2b73aad3d60e41e295cad3bbd *R/musk.R d4ed00a925ec98b801a718df387e3895 *R/pivotal.R -9fe17395d40d8b65d166ba0dde949cb8 *R/plot.R -32d5653664bf73e0bf89351d0810c513 *R/relab.R -94e689ce4cad315e568e8500bc70d9cb *R/sim.R +175f218e2732d4607ccec20593d25e98 *R/plot.R +c66feba4425dc375249e06477286274e *R/relab.R +9c01247fc760bb1e584b551eb8761267 *R/sim.R 87606bf157b96b52f5b702f34951f885 *README.md -510662c227cf4dd72a8bcf3ae43469f2 *build/vignette.rds -83e37e5245796587040cd7fbd6a66a90 *inst/doc/K-means_clustering_using_the_MUS_algorithm.R -3ff415f107be9ed39ef989e2680befdf *inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd -41dd29112371ace304d7a3745aac47e9 *inst/doc/K-means_clustering_using_the_MUS_algorithm.html -e5aaf256b4eb70e3a1995ec90c8cb9be *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.R -2287794ea6095c7226a72baa3a0ff00d *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.Rmd -1474980080f02950b82b8b0cdae7a62d *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.html +6ee2012d1bfa7c51834c10336ece55a6 *build/vignette.rds +9cbb93e3c363d2fbb5d57ead44fa1795 *inst/doc/K-means_clustering_using_the_MUS_algorithm.R +a41635a5daf726815a3dc1ce9dc985e9 *inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd +1ef5027874d415d8e0881767419e384b *inst/doc/K-means_clustering_using_the_MUS_algorithm.html +39a6c3150de69f5c6e9dec870108ada0 *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.R +6cb98f378e7b69662202049abfbc738c *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.Rmd +fab773169134384bb43726926cc5291f *inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.html 14b018d2b8d5b3ff6060ce2dc9be103a *man/MUS.Rd 83b55b9e457e02bcfb781b0781f06bf2 *man/figures/README-kmeans_plots-1.png 387d2506d151432d2ab11212700884fa *man/figures/README-musk-1.png 9bb0dfd1d8cdbfa77d0772d3c20c3de9 *man/figures/README-plot-1.png 3d48ce33fa5475a43821cfcc735faba4 *man/figures/README-plot-2.png 
-0d02a9db83f0edc9cf8d03726fe5697e *man/piv_KMeans.Rd -3fd1b7bef45588816d749483b83dcd8d *man/piv_MCMC.Rd -280286df4cd9476e674cc91530a7bfdf *man/piv_plot.Rd -a3c65e21d5b6ceb0ad32842638343f47 *man/piv_rel.Rd +6b8a2c6fc409a382f6587387e036a29e *man/piv_KMeans.Rd +8bccbd3092e56186661989d07be35ff8 *man/piv_MCMC.Rd +4ec5457f592dff5313e1752d5418e226 *man/piv_plot.Rd +1495ea94519ec6a1f3dcc1974f85f666 *man/piv_rel.Rd 63c33936022c37de6581db9f5a410ec3 *man/piv_sel.Rd -60cc646bc8387133c832cca8488694ca *man/piv_sim.Rd +0082e12811345c6b9f11b561ea0d639d *man/piv_sim.Rd d565fe1d140bd6770f832d28a6606458 *tests/testthat/test-bayesMCMC.R 1cd9a2b29c1aa0cc8934f47a77800059 *tests/testthat/test-bayesMCMCbiv.R c410e73d8c2d40bdb9c3fe39c6c184fe *tests/testthat/test-mus.R e2d84ff6c6e19838a8101d236bcd74bb *tests/testthat/testthat.R -3ff415f107be9ed39ef989e2680befdf *vignettes/K-means_clustering_using_the_MUS_algorithm.Rmd -2287794ea6095c7226a72baa3a0ff00d *vignettes/Relabelling_in_Bayesian_mixtures_by_pivotal_units.Rmd -4ba2fe1c5cb39e918396e888bd206053 *vignettes/ref.bib +a41635a5daf726815a3dc1ce9dc985e9 *vignettes/K-means_clustering_using_the_MUS_algorithm.Rmd +6cb98f378e7b69662202049abfbc738c *vignettes/Relabelling_in_Bayesian_mixtures_by_pivotal_units.Rmd +19fa512f8bfd56e38f4a5e5fc31b1b2a *vignettes/ref.bib diff --git a/NAMESPACE b/NAMESPACE index d7957c7..a172076 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,4 +1,4 @@ -exportPattern("^[[:alpha:]]+") +#exportPattern("^[[:alpha:]]+") export(piv_rel) export(piv_MCMC) export(piv_sel) @@ -7,29 +7,22 @@ export(piv_sim) export(MUS) export(piv_KMeans) import(rjags) -import(bayesmix) -import(mvtnorm) -#import(rstan) +import(bayesplot) +importFrom(bayesmix, BMMmodel, BMMpriors, priorsFish, + JAGScontrol, JAGSrun) +importFrom(mvtnorm, rmvnorm) importFrom(rstan, - #rstan-package, - #as.array, As.mcmc.list, check_hmc_diagnostics , - #Diagnostic plots , expose_stan_functions , extract , extract_sparse_parts , - #log_prob-methods , - #loo.stanfit , lookup , makeconf_path , monitor , optimizing , - #pairs.stanfit, - #print , read_rdump, read_stan_csv, - #rstan-plotting-functions, rstan.package.skeleton , rstan_gg_options , rstan_options , @@ -39,12 +32,9 @@ importFrom(rstan, stan, stanc , #stanfit-class , - #stanmodel-class , - #stan_demo , stan_model, stan_rdump , stan_version , - #summary-methods , traceplot , vb ) diff --git a/NEWS.md b/NEWS.md index 6229753..83c0ae1 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,18 @@ +# pivmet 0.3.0 + +## Major fix + +* multivariate mixtures allowed +* stan divergences +* stan fit available (e.g. for bayesplot package) + +## Minor fix + +* drop packages dependencies +* vignettes updated +* runjags arguments renamed (mu, tau, eta) +* more checks in piv_rel function + # pivmet 0.2.0 * rstan now incorporated to fit mixtures diff --git a/R/bayes_mcmc.R b/R/bayes_mcmc.R index 2ecd5ba..f4224ed 100644 --- a/R/bayes_mcmc.R +++ b/R/bayes_mcmc.R @@ -2,7 +2,7 @@ #' #' Perform MCMC JAGS sampling or HMC Stan sampling for Gaussian mixture models, post-process the chains and apply a clustering technique to the MCMC sample. Pivotal units for each group are selected among four alternative criteria. #' @param y \eqn{N}-dimensional vector for univariate data or -#' \eqn{N \times 2} matrix for bivariate data. +#' \eqn{N \times D} matrix for bivariate data. #' @param k Number of mixture components. #' @param nMC Number of MCMC iterations for the JAGS/Stan function execution. #' @param priors Input prior hyperparameters (see Details for default options). 
@@ -22,33 +22,33 @@ #' \code{software="rstan"}). Default is 1. #' #' @details -#' The function fits univariate and bivariate Bayesian Gaussian mixture models of the form +#' The function fits univariate and multivariate Bayesian Gaussian mixture models of the form #' (here for univariate only): -#' \deqn{(Y_i|Z_i=j) \sim \mathcal{N}(\mu_j,\phi_j),} +#' \deqn{(Y_i|Z_i=j) \sim \mathcal{N}(\mu_j,\sigma_j),} #' where the \eqn{Z_i}, \eqn{i=1,\ldots,N}, are i.i.d. random variables, \eqn{j=1,\dots,k}, -#' \eqn{\phi_j} is the group variance, \eqn{Z_i \in {1,\ldots,k }} are the +#' \eqn{\sigma_j} is the group variance, \eqn{Z_i \in {1,\ldots,k }} are the #' latent group allocation, and -#' \deqn{P(Z_i=j)=\pi_j.} +#' \deqn{P(Z_i=j)=\eta_j.} #' The likelihood of the model is then -#' \deqn{L(y;\mu,\pi,\phi) = \prod_{i=1}^N \sum_{j=1}^k \pi_j \mathcal{N}(\mu_j,\phi_j),} -#' where \eqn{(\mu, \phi)=(\mu_{1},\dots,\mu_{k},\phi_{1},\ldots,\phi_{k})} -#' are the component-specific parameters and \eqn{\pi=(\pi_{1},\dots,\pi_{k})} +#' \deqn{L(y;\mu,\eta,\sigma) = \prod_{i=1}^N \sum_{j=1}^k \eta_j \mathcal{N}(\mu_j,\sigma_j),} +#' where \eqn{(\mu, \sigma)=(\mu_{1},\dots,\mu_{k},\sigma_{1},\ldots,\sigma_{k})} +#' are the component-specific parameters and \eqn{\eta=(\eta_{1},\dots,\eta_{k})} #' the mixture weights. Let \eqn{\nu} denote a permutation of \eqn{{ 1,\ldots,k }}, #' and let \eqn{\nu(\mu)= (\mu_{\nu(1)},\ldots,} \eqn{ \mu_{\nu(k)})}, -#' \eqn{\nu(\phi)= (\phi_{\nu(1)},\ldots,} \eqn{ \phi_{\nu(k)})}, -#' \eqn{ \nu(\pi)=(\pi_{\nu(1)},\ldots,\pi_{\nu(k)})} be the -#' corresponding permutations of \eqn{\mu}, \eqn{\phi} and \eqn{\pi}. +#' \eqn{\nu(\sigma)= (\sigma_{\nu(1)},\ldots,} \eqn{ \sigma_{\nu(k)})}, +#' \eqn{ \nu(\eta)=(\eta_{\nu(1)},\ldots,\eta_{\nu(k)})} be the +#' corresponding permutations of \eqn{\mu}, \eqn{\sigma} and \eqn{\eta}. #' Denote by \eqn{V} the set of all the permutations of the indexes #' \eqn{{1,\ldots,k }}, the likelihood above is invariant under any #' permutation \eqn{\nu \in V}, that is #' \deqn{ -#' L(y;\mu,\pi,\phi) = L(y;\nu(\mu),\nu(\pi),\nu(\phi)).} +#' L(y;\mu,\eta,\sigma) = L(y;\nu(\mu),\nu(\eta),\nu(\sigma)).} #' As a consequence, the model is unidentified with respect to an #' arbitrary permutation of the labels. #' When Bayesian inference for the model is performed, -#' if the prior distribution \eqn{p_0(\mu,\pi,\phi)} is invariant under a permutation of the indices, then so is the posterior. That is, if \eqn{p_0(\mu,\pi,\phi) = p_0(\nu(\mu),\nu(\pi),\phi)}, then +#' if the prior distribution \eqn{p_0(\mu,\eta,\sigma)} is invariant under a permutation of the indices, then so is the posterior. That is, if \eqn{p_0(\mu,\eta,\sigma) = p_0(\nu(\mu),\nu(\eta),\sigma)}, then #'\deqn{ -#' p(\mu,\pi,\phi| y) \propto p_0(\mu,\pi,\phi)L(y;\mu,\pi,\phi)} +#' p(\mu,\eta,\sigma| y) \propto p_0(\mu,\eta,\sigma)L(y;\mu,\eta,\sigma)} #' is multimodal with (at least) \eqn{k!} modes. 
#' #' Depending on the selected software, the model parametrization @@ -63,8 +63,8 @@ #' \code{bayesmix} package: #' #' \deqn{\mu_j \sim \mathcal{N}(\mu_0, 1/B0inv)} -#' \deqn{\phi_j \sim \mbox{invGamma}(nu0Half, nu0S0Half)} -#' \deqn{\pi \sim \mbox{Dirichlet}(1,\ldots,1)} +#' \deqn{\sigma_j \sim \mbox{invGamma}(nu0Half, nu0S0Half)} +#' \deqn{\eta \sim \mbox{Dirichlet}(1,\ldots,1)} #' \deqn{S0 \sim \mbox{Gamma}(g0Half, g0G0Half),} #' #' with default values: \eqn{\mu_0=0, B0inv=0.1, nu0Half =10, S0=2, @@ -80,19 +80,19 @@ #' When \code{software="rstan"}, the prior specification is: #' #' \deqn{\mu_j \sim \mathcal{N}(\mu_0, 1/B0inv)} -#' \deqn{\phi_j \sim \mbox{Lognormal}(\mu_{\phi}, \sigma_{\phi})} -#' \deqn{\pi_j \sim \mbox{Uniform}(0,1),} +#' \deqn{\sigma_j \sim \mbox{Lognormal}(\mu_{\sigma}, \tau_{\sigma})} +#' \deqn{\eta_j \sim \mbox{Uniform}(0,1),} #' -#' with default values: \eqn{\mu_0=0, B0inv=0.1, \mu_{\phi}=0, \sigma_{\phi}=2}. +#' with default values: \eqn{\mu_0=0, B0inv=0.1, \mu_{\sigma}=0, \tau_{\sigma}=2}. #' The users may specify new hyperparameter values with the argument: #' -#' \code{priors=list(mu_0=1, B0inv=0.2, mu_phi=3, sigma_phi=5)} +#' \code{priors=list(mu_0=1, B0inv=0.2, mu_sigma=3, tau_sigma=5)} #' -#'For bivariate mixtures, when \code{software="rjags"} the prior specification is the following: +#'For multivariate mixtures, when \code{software="rjags"} the prior specification is the following: #' -#'\deqn{ \bm{\mu}_j \sim \mathcal{N}_2(\bm{\mu}_0, S2)} -#'\deqn{ 1/\Sigma \sim \mbox{Wishart}(S3, 3)} -#'\deqn{\pi \sim \mbox{Dirichlet}(\bm{\alpha}),} +#'\deqn{ \bm{\mu}_j \sim \mathcal{N}_D(\bm{\mu}_0, S2)} +#'\deqn{ \Sigma^{-1} \sim \mbox{Wishart}(S3, D+1)} +#'\deqn{\eta \sim \mbox{Dirichlet}(\bm{\alpha}),} #' #'where \eqn{\bm{\alpha}} is a \eqn{k}-dimensional vector #'and \eqn{S_2} and \eqn{S_3} @@ -110,27 +110,27 @@ #' #'When \code{software="rstan"}, the prior specification is: #' -#'\deqn{ \bm{\mu}_j \sim \mathcal{N}_2(\bm{\mu}_0, LDL^{T})} -#'\deqn{L \sim \mbox{LKJ}(\eta)} -#'\deqn{D_j \sim \mbox{HalfCauchy}(0, \sigma_d).} +#'\deqn{ \bm{\mu}_j \sim \mathcal{N}_D(\bm{\mu}_0, LD*L^{T})} +#'\deqn{L \sim \mbox{LKJ}(\epsilon)} +#'\deqn{D^*_j \sim \mbox{HalfCauchy}(0, \sigma_d).} #' -#'The covariance matrix is expressed in terms of the LDL decomposition as \eqn{LDL^{T}}, -#'a variant of the classical Cholesky decomposition, where \eqn{L} is a \eqn{2 \times 2} -#'lower unit triangular matrix and \eqn{D} is a \eqn{2 \times 2} diagonal matrix. -#'The Cholesky correlation factor \eqn{L} is assigned a LKJ prior with \eqn{\eta} degrees of freedom, which, +#'The covariance matrix is expressed in terms of the LDL decomposition as \eqn{LD*L^{T}}, +#'a variant of the classical Cholesky decomposition, where \eqn{L} is a \eqn{D \times D} +#'lower unit triangular matrix and \eqn{D*} is a \eqn{D \times D} diagonal matrix. +#'The Cholesky correlation factor \eqn{L} is assigned a LKJ prior with \eqn{\epsilon} degrees of freedom, which, #'combined with priors on the standard deviations of each component, induces a prior on the covariance matrix; -#'as \eqn{\eta \rightarrow \infty} the magnitude of correlations between components decreases, -#'whereas \eqn{\eta=1} leads to a uniform prior distribution for \eqn{L}. -#'By default, the hyperparameters are \eqn{\bm{\mu}_0=\bm{0}}, \eqn{\sigma_d=2.5, \eta=1}. +#'as \eqn{\epsilon \rightarrow \infty} the magnitude of correlations between components decreases, +#'whereas \eqn{\epsilon=1} leads to a uniform prior distribution for \eqn{L}. 
+#'By default, the hyperparameters are \eqn{\bm{\mu}_0=\bm{0}}, \eqn{\sigma_d=2.5, \epsilon=1}. #'The user may propose some different values with the argument: #' #' -#' \code{priors=list(mu_0=c(1,2), sigma_d = 4, eta =2)} +#' \code{priors=list(mu_0=c(1,2), sigma_d = 4, epsilon =2)} #' #' #' If \code{software="rjags"} the function performs JAGS sampling using the \code{bayesmix} package #' for univariate Gaussian mixtures, and the \code{runjags} -#' package for bivariate Gaussian mixtures. If \code{software="rstan"} the function performs +#' package for multivariate Gaussian mixtures. If \code{software="rstan"} the function performs #' Hamiltonian Monte Carlo (HMC) sampling via the \code{rstan} package (see the vignette and the Stan project #' for any help). #' @@ -162,21 +162,21 @@ #' vector.} #' \item{\code{mcmc_mean}}{ If \code{y} is a vector, a \eqn{true.iter \times k} #' matrix with the post-processed MCMC chains for the mean parameters; if -#' \code{y} is a matrix, a \eqn{true.iter \times 2 \times k} array with +#' \code{y} is a matrix, a \eqn{true.iter \times D \times k} array with #' the post-processed MCMC chains for the mean parameters.} #' \item{\code{mcmc_sd}}{ If \code{y} is a vector, a \eqn{true.iter \times k} #' matrix with the post-processed MCMC chains for the sd parameters; if -#' \code{y} is a matrix, a \eqn{true.iter \times 2} array with +#' \code{y} is a matrix, a \eqn{true.iter \times D} array with #' the post-processed MCMC chains for the sd parameters.} #' \item{\code{mcmc_weight}}{A \eqn{true.iter \times k} #' matrix with the post-processed MCMC chains for the weights parameters.} #'\item{\code{mcmc_mean_raw}}{ If \code{y} is a vector, a \eqn{(nMC-burn) \times k} matrix #' with the raw MCMC chains for the mean parameters as given by JAGS; if -#' \code{y} is a matrix, a \eqn{(nMC-burn) \times 2 \times k} array with the raw MCMC chains +#' \code{y} is a matrix, a \eqn{(nMC-burn) \times D \times k} array with the raw MCMC chains #' for the mean parameters as given by JAGS/Stan.} #' \item{\code{mcmc_sd_raw}}{ If \code{y} is a vector, a \eqn{(nMC-burn) \times k} matrix #' with the raw MCMC chains for the sd parameters as given by JAGS/Stan; if -#' \code{y} is a matrix, a \eqn{(nMC-burn) \times 2} array with the raw MCMC chains +#' \code{y} is a matrix, a \eqn{(nMC-burn) \times D} array with the raw MCMC chains #' for the sd parameters as given by JAGS/Stan.} #' \item{\code{mcmc_weight_raw}}{A \eqn{(nMC-burn) \times k} matrix #' with the raw MCMC chains for the weights parameters as given by JAGS/Stan.} @@ -185,6 +185,7 @@ #' \code{"diana"} or \code{"hclust"}.} #' \item{\code{pivots}}{The vector of indices of pivotal units identified by the selected pivotal criterion.} #' \item{\code{model}}{The JAGS/Stan model code. Apply the \code{"cat"} function for a nice visualization of the code.} +#' \item{\code{stanfit}}{An object of S4 class \code{stanfit} for the fitted model (only if \code{software="rstan"}).} #' #' @author Leonardo Egidi \email{legidi@units.it} #' @references Egidi, L., Pappadà, R., Pauli, F. and Torelli, N. (2018). 
Relabelling in Bayesian Mixture @@ -196,15 +197,15 @@ #'\dontrun{ #' N <- 200 #' k <- 4 +#' D <- 2 #' nMC <- 1000 -#' M1 <-c(-.5,8) +#' M1 <- c(-.5,8) #' M2 <- c(25.5,.1) #' M3 <- c(49.5,8) #' M4 <- c(63.0,.1) -#' Mu <- matrix(rbind(M1,M2,M3,M4),c(4,2)) -#' sds <- cbind(rep(1,k), rep(20,k)) -#' Sigma.p1 <- matrix(c(sds[1,1]^2,0,0,sds[1,1]^2), nrow=2, ncol=2) -#' Sigma.p2 <- matrix(c(sds[1,2]^2,0,0,sds[1,2]^2), nrow=2, ncol=2) +#' Mu <- rbind(M1,M2,M3,M4) +#' Sigma.p1 <- diag(D) +#' Sigma.p2 <- 20*diag(D) #' W <- c(0.2,0.8) #' sim <- piv_sim(N = N, k = k, Mu = Mu, #' Sigma.p1 = Sigma.p1, @@ -230,6 +231,7 @@ #' ### Fishery data (bayesmix package) #' #'\dontrun{ +#' library(bayesmix) #' data(fish) #' y <- fish[,1] #' k <- 5 @@ -248,13 +250,14 @@ + piv_MCMC <- function(y, k, nMC, priors, - piv.criterion = "maxsumdiff", - clustering = "diana", - software = "rjags", + piv.criterion = c("MUS", "maxsumint", "minsumnoint", "maxsumdiff"), + clustering = c("diana", "hclust"), + software = c("rjags", "rstan"), burn =0.5*nMC, chains = 4, cores = 1){ @@ -262,30 +265,21 @@ piv_MCMC <- function(y, #### checks # piv.criterion - list_crit <- c("MUS", "maxsumint", "minsumnoint", "maxsumdiff") - if (sum(piv.criterion!=list_crit)==4){ - stop(paste("object ", "'", piv.criterion,"'", " not found. - Please select one among the following pivotal - criteria: MUS, maxsumint, minsumnoint, maxsumdiff", sep="")) + if (missing(piv.criterion)){ + piv.criterion <- "maxsumdiff" } + list_crit <- c("MUS", "maxsumint", "minsumnoint", "maxsumdiff") + piv.criterion <- match.arg(piv.criterion, list_crit) # clustering list_clust <- c("diana", "hclust") - if (sum(clustering!=list_clust)==2){ - stop(paste("object ", "'", clustering,"'", " not found. - Please select one among the following - clustering methods: diana, hclust", sep="")) - } + clustering <- match.arg(clustering, list_clust) # software list_soft <- c("rjags", "rstan") - if (sum(software!=list_soft)==2){ - stop(paste("object ", "'", software,"'", " not found. 
- Please select one among the following - softwares: rjags, rstan", sep="")) - } + software <- match.arg(software, list_soft) # burn-in @@ -295,7 +289,6 @@ piv_MCMC <- function(y, } - ### # Conditions about data dimension---------------- @@ -438,8 +431,8 @@ piv_MCMC <- function(y, if(missing(priors)){ mu_0 <- 0 B0inv <- 0.1 - mu_phi <- 0 - sigma_phi <- 2 + mu_sigma <- 0 + tau_sigma <- 2 }else{ if (is.null(priors$mu_0)){ mu_0 <- 0 @@ -451,21 +444,21 @@ piv_MCMC <- function(y, }else{ B0inv <- priors$B0inv } - if (is.null(priors$mu_phi)){ - mu_phi <- 0 + if (is.null(priors$mu_sigma)){ + mu_sigma <- 0 }else{ - mu_phi <- priors$mu_phi + mu_sigma <- priors$mu_sigma } - if (is.null(priors$sigma_phi)){ - sigma_phi <- 2 + if (is.null(priors$tau_sigma)){ + tau_sigma <- 2 }else{ - sigma_phi <- priors$sigma_phi + tau_sigma <- priors$tau_sigma } } data = list(N=N, y=y, k=k, mu_0=mu_0, B0inv=B0inv, - mu_phi=mu_phi, sigma_phi=sigma_phi) + mu_sigma=mu_sigma, tau_sigma=tau_sigma) mix_univ <-" data { int k; // number of mixture components @@ -473,31 +466,31 @@ piv_MCMC <- function(y, real y[N]; // observations real mu_0; // mean hyperparameter real B0inv; // mean hyperprecision - real mu_phi; // sigma hypermean - real sigma_phi; // sigma hyper sd + real mu_sigma; // sigma hypermean + real tau_sigma; // sigma hyper sd } parameters { - simplex[k] theta; // mixing proportions + simplex[k] eta; // mixing proportions ordered[k] mu; // locations of mixture components vector[k] sigma; // scales of mixture components } transformed parameters{ - vector[k] log_theta = log(theta); // cache log calculation + vector[k] log_eta = log(eta); // cache log calculation vector[k] pz[N]; simplex[k] exp_pz[N]; for (n in 1:N){ pz[n] = normal_lpdf(y[n]|mu, sigma)+ - log_theta- + log_eta- log_sum_exp(normal_lpdf(y[n]|mu, sigma)+ - log_theta); + log_eta); exp_pz[n] = exp(pz[n]); } } model { - sigma ~ lognormal(mu_phi, sigma_phi); + sigma ~ lognormal(mu_sigma, tau_sigma); mu ~ normal(mu_0, 1/B0inv); for (n in 1:N) { - vector[k] lps = log_theta; + vector[k] lps = log_eta; for (j in 1:k){ lps[j] += normal_lpdf(y[n] | mu[j], sigma[j]); target+=pz[n,j]; @@ -516,14 +509,15 @@ piv_MCMC <- function(y, data=data, chains =chains, iter =nMC) - printed <- cat(print(fit_univ, pars =c("mu", "theta", "sigma"))) + stanfit <- fit_univ + printed <- cat(print(fit_univ, pars =c("mu", "eta", "sigma"))) sims_univ <- rstan::extract(fit_univ) J <- 3 - mcmc.pars <- array(data = NA, dim = c(dim(sims_univ$theta)[1], k, J)) + mcmc.pars <- array(data = NA, dim = c(dim(sims_univ$eta)[1], k, J)) mcmc.pars[ , , 1] <- sims_univ$mu mcmc.pars[ , , 2] <- sims_univ$sigma - mcmc.pars[ , , 3] <- sims_univ$theta + mcmc.pars[ , , 3] <- sims_univ$eta mu_pre_switch_compl <- mcmc.pars[ , , 1] tau_pre_switch_compl <- mcmc.pars[ , , 2] @@ -609,13 +603,15 @@ piv_MCMC <- function(y, }else if (is.matrix(y)){ N <- dim(y)[1] + D <- dim(y)[2] # Parameters' initialization clust_inits <- KMeans(y, k)$cluster #cutree(hclust(dist(y), "average"),k) - mu_inits <- matrix(0,k,2) + mu_inits <- matrix(0,k,D) for (j in 1:k){ - mu_inits[j,] <- cbind(mean(y[clust_inits==j,1]), mean(y[clust_inits==j,2])) - } + for (d in 1:D){ + mu_inits[j, d] <- mean(y[clust_inits==j,d]) + }} #Reorder mu_inits according to the x-coordinate mu_inits <- mu_inits[sort(mu_inits[,1], decreasing=FALSE, index.return=TRUE)$ix,] @@ -626,23 +622,23 @@ piv_MCMC <- function(y, # Initial values if (missing(priors)){ - mu_0 <- as.vector(c(0,0)) - S2 <- matrix(c(1,0,0,1),nrow=2)/100000 - S3 <- matrix(c(1,0,0,1),nrow=2)/100000 + 
mu_0 <- rep(0, D) + S2 <- diag(D)/100000 + S3 <- diag(D)/100000 alpha <- rep(1,k) }else{ if (is.null(priors$mu_0)){ - mu_0 <- as.vector(c(0,0)) + mu_0 <- rep(0, D) }else{ mu_0 <- priors$mu_0 } if (is.null(priors$S2)){ - S2 <- matrix(c(1,0,0,1),nrow=2)/100000 + S2 <- diag(D)/100000 }else{ S2 <- priors$S2 } if (is.null(priors$S3)){ - S3 <- matrix(c(1,0,0,1),nrow=2)/100000 + S3 <- diag(D)/100000 }else{ S3 <- priors$S3 } @@ -654,7 +650,7 @@ piv_MCMC <- function(y, } # Data - dati.biv <- list(y = y, N = N, k = k, + dati.biv <- list(y = y, N = N, k = k, D = D, S2= S2, S3= S3, mu_0=mu_0, alpha = alpha) @@ -663,26 +659,27 @@ piv_MCMC <- function(y, # Likelihood: for (i in 1:N){ - yprev[i,1:2]<-y[i,1:2] - y[i,1:2] ~ dmnorm(muOfClust[clust[i],],tauOfClust) - clust[i] ~ dcat(pClust[1:k] ) + yprev[i,1:D]<-y[i,1:D] + y[i,1:D] ~ dmnorm(mu[clust[i],],tau) + clust[i] ~ dcat(eta[1:k] ) } # Prior: for (g in 1:k) { - muOfClust[g,1:2] ~ dmnorm(mu_0[],S2[,])} - tauOfClust[1:2,1:2] ~ dwish(S3[,],3) - Sigma[1:2,1:2] <- inverse(tauOfClust[,]) - pClust[1:k] ~ ddirch(alpha) + mu[g,1:D] ~ dmnorm(mu_0[],S2[,])} + tau[1:D,1:D] ~ dwish(S3[,],D+1) + Sigma[1:D,1:D] <- inverse(tau[,]) + eta[1:k] ~ ddirch(alpha) }" init1.biv <- list() - for (s in 1:chains) - init1.biv[[s]] <- dump.format(list(muOfClust=mu_inits, - tauOfClust= matrix(c(15,0,0,15),ncol=2), - pClust=rep(1/k,k), clust=clust_inits)) - moni.biv <- c("clust","muOfClust","tauOfClust","pClust") + for (s in 1:chains){ + init1.biv[[s]] <- dump.format(list(mu=mu_inits, + tau= 15*diag(D), + eta=rep(1/k,k), clust=clust_inits)) + } + moni.biv <- c("clust","mu","tau","eta") mod <- mod.mist.biv dati <- dati.biv @@ -693,9 +690,9 @@ piv_MCMC <- function(y, ogg.jags <- run.jags(model=mod, data=dati, monitor=moni, inits=init1, n.chains=chains,plots=FALSE, thin=1, sample=nMC, burnin=burn) - printed <- print(add.summary(ogg.jags, vars= c("muOfClust", - "tauOfClust", - "pClust"))) + printed <- print(add.summary(ogg.jags, vars= c("mu", + "tau", + "eta"))) # Extraction ris <- ogg.jags$mcmc[[1]] @@ -703,14 +700,17 @@ piv_MCMC <- function(y, group <- ris[-(1:burn),grep("clust[",colnames(ris),fixed=TRUE)] # only the variances - tau <- sqrt( (1/ris[-(1:burn),grep("tauOfClust[",colnames(ris),fixed=TRUE)])[,c(1,4)]) - prob.st <- ris[-(1:burn),grep("pClust[",colnames(ris),fixed=TRUE)] + tau <- sqrt( (1/ris[-(1:burn),grep("tau[",colnames(ris),fixed=TRUE)])[,c(1,4)]) + prob.st <- ris[-(1:burn),grep("eta[",colnames(ris),fixed=TRUE)] M <- nrow(group) H <- list() - mu_pre_switch_compl <- array(rep(0, M*2*k), dim=c(M,2,k)) + mu_pre_switch_compl <- array(rep(0, M*D*k), dim=c(M,D,k)) for (i in 1:k){ - H[[i]] <- ris[-(1:burn),grep("muOfClust",colnames(ris),fixed=TRUE)][,c(i,i+k)] + H[[i]] <- ris[-(1:burn), + grep(paste("mu[",i, sep=""), + colnames(ris),fixed=TRUE)] + #[,c(i,i+k)] } for (i in 1:k){ mu_pre_switch_compl[,,i] <- as.matrix(H[[i]]) @@ -725,9 +725,11 @@ piv_MCMC <- function(y, #return(1) }else{ L<-list() - mu_pre_switch <- array(rep(0, true.iter*2*k), dim=c(true.iter,2,k)) + mu_pre_switch <- array(rep(0, true.iter*D*k), dim=c(true.iter,D,k)) for (i in 1:k){ - L[[i]] <- ris[,grep("muOfClust",colnames(ris),fixed=TRUE)][,c(i,i+k)] + L[[i]] <- ris[,grep(paste("mu[", i, sep=""), + colnames(ris),fixed=TRUE)] + #[,c(i,i+k)] } for (i in 1:k){ mu_pre_switch[,,i] <- as.matrix(L[[i]]) @@ -742,26 +744,26 @@ piv_MCMC <- function(y, mcmc_weight_raw = prob.st mcmc_sd_raw = tau - tau <- sqrt( (1/ris[,grep("tauOfClust[",colnames(ris),fixed=TRUE)])[,c(1,4)]) - prob.st <- 
ris[,grep("pClust[",colnames(ris),fixed=TRUE)] + tau <- sqrt( (1/ris[,grep("tau[",colnames(ris),fixed=TRUE)])[,c(1,4)]) + prob.st <- ris[,grep("eta[",colnames(ris),fixed=TRUE)] mu <- mu_pre_switch }else if(software=="rstan"){ if (missing(priors)){ - mu_0 <- c(0,0) - eta <- 1 + mu_0 <- rep(0, D) + epsilon <- 1 sigma_d <- 2.5 }else{ if (is.null(priors$mu_0)){ - mu_0 <- c(0,0) + mu_0 <- rep(0, D) }else{ mu_0 <- priors$mu_0 } - if (is.null(priors$eta)){ - eta <- 1 + if (is.null(priors$epsilon)){ + epsilon <- 1 }else{ - eta <- priors$eta + epsilon <- priors$epsilon } if (is.null(priors$sigma_d)){ sigma_d <- 2.5 @@ -769,8 +771,8 @@ piv_MCMC <- function(y, sigma_d <- priors$sigma_d } } - data =list(N=N, k=k, y=y, D=2, mu_0=mu_0, - eta = eta, sigma_d = sigma_d) + data =list(N=N, k=k, y=y, D=D, mu_0=mu_0, + epsilon = epsilon, sigma_d = sigma_d) mix_biv <- " data { int k; // number of mixture components @@ -778,11 +780,11 @@ piv_MCMC <- function(y, int D; // data dimension matrix[N,D] y; // observations matrix vector[D] mu_0; - real eta; + real epsilon; real sigma_d; } parameters { - simplex[k] theta; // mixing proportions + simplex[k] eta; // mixing proportions vector[D] mu[k]; // locations of mixture components cholesky_factor_corr[D] L_Omega; // scales of mixture components vector[D] L_sigma; @@ -790,7 +792,7 @@ piv_MCMC <- function(y, vector[D] L_tau; } transformed parameters{ - vector[k] log_theta = log(theta); // cache log calculation + vector[k] log_eta = log(eta); // cache log calculation vector[k] pz[N]; simplex[k] exp_pz[N]; matrix[D,D] L_Sigma=diag_pre_multiply(L_sigma, L_Omega); @@ -799,19 +801,19 @@ piv_MCMC <- function(y, for (n in 1:N){ pz[n]= multi_normal_cholesky_lpdf(y[n]|mu, L_Sigma)+ - log_theta- + log_eta- log_sum_exp(multi_normal_cholesky_lpdf(y[n]| mu, L_Sigma)+ - log_theta); + log_eta); exp_pz[n] = exp(pz[n]); } } model{ - L_Omega ~ lkj_corr_cholesky(eta); + L_Omega ~ lkj_corr_cholesky(epsilon); L_sigma ~ cauchy(0, sigma_d); mu ~ multi_normal_cholesky(mu_0, L_Tau); for (n in 1:N) { - vector[k] lps = log_theta; + vector[k] lps = log_eta; for (j in 1:k){ lps[j] += multi_normal_cholesky_lpdf(y[n] | mu[j], L_Sigma); @@ -831,7 +833,8 @@ piv_MCMC <- function(y, data=data, chains =chains, iter =nMC) - printed <- cat(print(fit_biv, pars=c("mu", "theta", "L_Sigma"))) + stanfit <- fit_biv + printed <- cat(print(fit_biv, pars=c("mu", "eta", "L_Sigma"))) sims_biv <- rstan::extract(fit_biv) # Extraction @@ -840,10 +843,10 @@ piv_MCMC <- function(y, # Post- process of the chains---------------------- group <- sims_biv$z tau <- sims_biv$L_sigma - prob.st <- sims_biv$theta + prob.st <- sims_biv$eta M <- nrow(group) - mu_pre_switch_compl <- array(rep(0, M*2*k), dim=c(M,2,k)) + mu_pre_switch_compl <- array(rep(0, M*D*k), dim=c(M,D,k)) for (i in 1:M) mu_pre_switch_compl[i,,] <- t(sims_biv$mu[i,,]) # Discard iterations @@ -856,7 +859,7 @@ piv_MCMC <- function(y, #return(1) }else{ - mu_pre_switch <- array(rep(0, true.iter*2*k), dim=c(true.iter,2,k)) + mu_pre_switch <- array(rep(0, true.iter*D*k), dim=c(true.iter,D,k)) for (i in 1:true.iter) mu_pre_switch[i,,] <- t(sm[i,,]) } @@ -868,7 +871,7 @@ piv_MCMC <- function(y, group <- sims_biv$z[numeffettivogruppi==k,] mu <- mu_pre_switch tau <- sims_biv$L_sigma[numeffettivogruppi==k, ] - prob.st <- sims_biv$theta[numeffettivogruppi==k,] + prob.st <- sims_biv$eta[numeffettivogruppi==k,] FreqGruppiJags <- table(group) model_code <- mix_biv @@ -892,7 +895,7 @@ piv_MCMC <- function(y, if (cont > 1){ k <- cont } - mu_switch <- array(rep(0, true.iter*2*k), 
dim=c(true.iter,2,k)) + mu_switch <- array(rep(0, true.iter*D*k), dim=c(true.iter,D,k)) prob.st_switch <- array(0, dim=c(true.iter,k)) group <- group*0 z <- array(0,dim=c(N, k, true.iter)) @@ -977,7 +980,7 @@ piv_MCMC <- function(y, if (k <=4 & sum(C==0)!=0){ mus_res <- MUS(C, grr) - clust <- mus_res$pivots + pivots <- mus_res$pivots }else{ @@ -988,6 +991,10 @@ piv_MCMC <- function(y, } } + if (software == "rjags"){ + stanfit = NULL + } + return(list( true.iter = true.iter, @@ -1005,5 +1012,7 @@ piv_MCMC <- function(y, grr=grr, pivots = pivots, #print = printed, - model = model_code)) + model = model_code, + k = k, + stanfit = stanfit)) } diff --git a/R/musk.R b/R/musk.R index ba226f0..0b0abfe 100644 --- a/R/musk.R +++ b/R/musk.R @@ -76,9 +76,9 @@ #'x <- matrix(NA, N,2) #'truegroup <- c( rep(1,n1), rep(2, n2), rep(3, n3)) #' -#' x[1:n1,]=rmvnorm(n1, c(1,5), sigma=diag(2)) -#' x[(n1+1):(n1+n2),]=rmvnorm(n2, c(4,0), sigma=diag(2)) -#' x[(n1+n2+1):(n1+n2+n3),]=rmvnorm(n3, c(6,6), sigma=diag(2)) +#' x[1:n1,] <- rmvnorm(n1, c(1,5), sigma=diag(2)) +#' x[(n1+1):(n1+n2),] <- rmvnorm(n2, c(4,0), sigma=diag(2)) +#' x[(n1+n2+1):(n1+n2+n3),] <- rmvnorm(n3, c(6,6), sigma=diag(2)) #' #' # Apply piv_KMeans with MUS as pivotal criterion #' @@ -114,7 +114,7 @@ piv_KMeans <- function (x, centers, - alg.type = "KMeans", + alg.type = c("KMeans", "hclust"), method = "average", piv.criterion = c("MUS", "maxsumint", "minsumnoint", "maxsumdiff"), H = 1000, @@ -134,30 +134,17 @@ piv_KMeans <- function (x, centers, # type list_type <- c("KMeans", "hclust") - if (sum(alg.type!=list_type)==2){ - stop(paste("object ", "'", alg.type,"'", " not found. - Please select one among the following algorithms: - KMeans, hclust", sep="")) - } + alg.type <- match.arg(alg.type, list_type) # method list_method <- c("single", "complete", "average", "ward.D", "ward.D2", "mcquitty", "median", "centroid") - if (sum(method!=list_method)==8){ - stop(paste("object ", "'", method,"'", " not found. - Please select one among the following methods for hclust: - single, complete, average, ward.D, ward.D2, - mcquitty, median, centroid", sep="")) - } + method <- match.arg(method, list_method) # piv.criterion list_crit <- c("MUS", "maxsumint", "minsumnoint", "maxsumdiff") - if (sum(piv.criterion!=list_crit)==4){ - stop(paste("object ", "'", piv.criterion,"'", " not found. - Please select one among the following pivotal - criteria: MUS, maxsumint, minsumnoint, maxsumdiff", sep="")) - } + piv.criterion <- match.arg(piv.criterion, list_crit) ### diff --git a/R/plot.R b/R/plot.R index f4b9550..73c5376 100644 --- a/R/plot.R +++ b/R/plot.R @@ -11,6 +11,7 @@ #' #' # Fishery data #'\dontrun{ +#' library(bayesmix) #' data(fish) #' y <- fish[,1] #' N <- length(y) @@ -37,23 +38,23 @@ piv_plot <- function(y, type = c("chains", "hist") ){ colori <- c("red", "green", "violet", "blue") + ### checks + # data dimension + if (!is.null(dim(y))){ + if (dim(y)[2]>2){ + stop("No plots available for D>2!") + } + } + # par list_par <- c("mean", "sd", "weight", "all") - if (sum(par!=list_par)==4){ - stop(paste("object ", "'", par,"'", " not found. - Please select one among the following parameters: - mean, sd, weight, all", sep="")) - } + par <- match.arg(par, list_par) # type list_type <- c("chains", "hist") - if (sum(type!=list_type)==2){ - stop(paste("object ", "'", type,"'", " not found. 
- Please select one among the following types: - chains, hist", sep="")) - } + type <- match.arg(type, list_type) # missing type diff --git a/R/relab.R b/R/relab.R index 7e2db7e..e3bd3b4 100644 --- a/R/relab.R +++ b/R/relab.R @@ -78,9 +78,9 @@ #' as explained in Details.} #' \item{\code{final_it_p}}{The proportion of final valid MCMC iterations.} #' \item{\code{rel_mean}}{The relabelled chains of the means: a \code{final_it} \eqn{\times k} matrix for univariate data, -#' or a \code{final_it} \eqn{\times 2 \times k} array for bivariate data.} +#' or a \code{final_it} \eqn{\times D \times k} array for multivariate data.} #' \item{\code{rel_sd}}{The relabelled chains of the sd's: a \code{final_it} \eqn{\times k} matrix for univariate data, -#' or a \code{final_it} \eqn{\times 2} matrix for bivariate data.} +#' or a \code{final_it} \eqn{\times D} matrix for multivariate data.} #' \item{\code{rel_weight}}{The relabelled chains of the weights: a \code{final_it} \eqn{\times k} matrix.} #' #' @author Leonardo Egidi \email{legidi@units.it} @@ -108,16 +108,14 @@ #'\dontrun{ #' N <- 200 #' k <- 3 +#' D <- 2 #' nMC <- 5000 #' M1 <- c(-.5,8) #' M2 <- c(25.5,.1) #' M3 <- c(49.5,8) #' Mu <- matrix(rbind(M1,M2,M3),c(k,2)) -#' sds <- cbind(rep(1,k), rep(20,k)) -#' Sigma.p1 <- matrix(c(sds[1,1]^2,0,0,sds[1,1]^2), -#' nrow=2, ncol=2) -#' Sigma.p2 <- matrix(c(sds[1,2]^2,0,0,sds[1,2]^2), -#' nrow=2, ncol=2) +#' Sigma.p1 <- diag(D) +#' Sigma.p2 <- 20*diag(D) #' W <- c(0.2,0.8) #' sim <- piv_sim(N = N, k = k, Mu = Mu, #' Sigma.p1 = Sigma.p1, @@ -142,6 +140,7 @@ piv_rel<-function(mcmc){ N <- dim(mcmc$groupPost)[2] if (length(dim(mcmc$mcmc_mean_raw))==3){ k <- dim(mcmc$mcmc_mean)[3] + D <- dim(mcmc$mcmc_mean)[2] }else{ k <- dim(mcmc$mcmc_mean_raw)[2] } @@ -236,8 +235,8 @@ piv_rel<-function(mcmc){ }else{ k <- dim(mu_switch)[3] - mu_rel_median <- array(NA,c(2,k)) - mu_rel_mean <- array(NA,c(2,k)) + mu_rel_median <- array(NA,c(D,k)) + mu_rel_mean <- array(NA,c(D,k)) weights_rel_median <- c() weights_rel_mean <- c() groupD2 <- groupD[contD==0,] @@ -245,7 +244,7 @@ piv_rel<-function(mcmc){ prob.st_switchD <- prob.st_switch[contD==0,] true.iterD2 <- sum(contD==0) Final_It <- true.iterD2/nMC - mu_rel_complete <- array(NA, dim=c(true.iterD2, 2,k)) + mu_rel_complete <- array(NA, dim=c(true.iterD2, D,k)) weights_rel_complete <- array(NA, dim=c(true.iterD2,k)) if (true.iterD2!=0){ @@ -264,27 +263,26 @@ piv_rel<-function(mcmc){ } - ind <- array(NA, c( nrow(mu_rel_complete) ,2, k)) + ind <- array(NA, c( nrow(mu_rel_complete) ,D, k)) for (g in 1:nrow(mu_rel_complete)){ - prel1 <- c() - prel2 <- c() + prel <- matrix(NA, D, k) + for (d in 1:D){ for (h in 1:k){ - prel1[h] <- which.min((mu_rel_complete[g,1,]-Mu[h,1])^2) - prel2[h] <- which.min((mu_rel_complete[g,2,]-Mu[h,2])^2) + prel[d,h] <- which.min((mu_rel_complete[g,d,]-Mu[h,d])^2) + #prel2[h] <- which.min((mu_rel_complete[g,2,]-Mu[h,2])^2) } - ind[g,1,] <- prel1 - ind[g,2,] <- prel2 + ind[g,d,] <- prel[d,] + } } - mu_rel_median_tr <- array(NA, c(k,2)) - mu_rel_mean_tr <- array(NA, c(k,2)) + mu_rel_median_tr <- array(NA, c(k,D)) + mu_rel_mean_tr <- array(NA, c(k,D)) +for (d in 1:D){ for (h in 1:nrow(mu_rel_complete)){ - mu_rel_complete[h,1,] <- mu_rel_complete[h,1, ind[h,1,]] - mu_rel_complete[h,2,] <- mu_rel_complete[h,2, ind[h,2,]] - #weights_rel_complete[h,1] <- weights_rel_complete[h,ind[h,1,]] - #weights_rel_complete[h,2] <- weights_rel_complete[h,ind[h,2,]] - } + mu_rel_complete[h,d,] <- mu_rel_complete[h,d, ind[h,d,]] + } +} mu_rel_median <- apply(mu_rel_complete, 
c(2,3), median) mu_rel_mean <- apply(mu_rel_complete, c(2,3), mean) mu_rel_median_tr <- t(mu_rel_median) diff --git a/R/sim.R b/R/sim.R index c6f5291..0ad289b 100644 --- a/R/sim.R +++ b/R/sim.R @@ -1,7 +1,7 @@ #' Generate Data from a Gaussian Nested Mixture #' -#' Simulate N observations from a nested Gaussian mixture model -#' with k pre-specified components under uniform group probabilities \eqn{1/k}, +#' Simulate \eqn{N} observations from a nested Gaussian mixture model +#' with \eqn{k} pre-specified components under uniform group probabilities \eqn{1/k}, #' where each group is in turn #' drawn from a further level consisting of two subgroups. #' @@ -9,14 +9,14 @@ #' @param N The desired sample size. #' @param k The desired number of mixture components. #' @param Mu The input mean vector of length \eqn{k} for univariate -#' Gaussian mixtures; the input \eqn{k \times 2} matrix with the -#' means' coordinates for bivariate Gaussian mixtures. +#' Gaussian mixtures; the input \eqn{k \times D} matrix with the +#' means' coordinates for multivariate Gaussian mixtures. #' @param stdev For univariate mixtures, the \eqn{k \times 2} matrix #' of input standard deviations, #' where the first column contains the parameters for subgroup 1, #' and the second column contains the parameters for subgroup 2. -#' @param Sigma.p1 The \eqn{2 \times 2} covariance matrix for the first subgroup. For bivariate mixtures only. -#' @param Sigma.p2 The \eqn{2 \times 2} covariance matrix for the second subgroup. For bivariate mixtures only. +#' @param Sigma.p1 The \eqn{D \times D} covariance matrix for the first subgroup. For multivariate mixtures only. +#' @param Sigma.p2 The \eqn{D \times D} covariance matrix for the second subgroup. For multivariate mixtures only. #' @param W The vector for the mixture weights of the two subgroups. 
#' @return #' @@ -36,10 +36,10 @@ #' (Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}(\mu_{j}, \sigma^{2}_{js}), #' } #' -#' or from a bivariate nested Gaussian mixture: +#' or from a multivariate nested Gaussian mixture: #' #' \deqn{ -#' (Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}_{2}(\bm{\mu}_{j}, \Sigma_{s}), +#' (Y_i|Z_i=j) \sim \sum_{s=1}^{2} p_{js}\, \mathcal{N}_{D}(\bm{\mu}_{j}, \Sigma_{s}), #' } #' #' where \eqn{\sigma^{2}_{js}} is the variance for the group \eqn{j} and @@ -60,15 +60,13 @@ #' #' N <- 2000 #' k <- 3 +#' D <- 2 #' M1 <- c(-45,8) #' M2 <- c(45,.1) #' M3 <- c(100,8) -#' Mu <- matrix(rbind(M1,M2,M3),c(k,2)) -#' sds <- cbind(rep(1,k), rep(20,k)) -#' Sigma.p1 <- matrix(c( sds[1,1]^2, 0,0, -#' sds[1,1]^2), nrow=2, ncol=2) -#' Sigma.p2 <- matrix(c(sds[1,2]^2, 0,0, -#' sds[1,2]^2), nrow=2, ncol=2) +#' Mu <- rbind(M1,M2,M3) +#' Sigma.p1 <- diag(D) +#' Sigma.p2 <- 20*diag(D) #' W <- c(0.2,0.8) #' sim <- piv_sim(N = N, k = k, Mu = Mu, Sigma.p1 = Sigma.p1, #' Sigma.p2 = Sigma.p2, W = W) @@ -80,8 +78,8 @@ piv_sim <- function(N, k, Mu, stdev, - Sigma.p1 = matrix(c(1,0,0,1),2,2, byrow = TRUE), - Sigma.p2 = matrix(c(100,0,0,100),2,2, byrow = TRUE), + Sigma.p1 = diag(2), + Sigma.p2 = 100*diag(2), W = c(0.5, 0.5)){ # Generation--------------- @@ -161,9 +159,13 @@ piv_sim <- function(N, }else{ - + D <- dim(Mu)[2] + if (dim(Sigma.p1)[2] + dim(Sigma.p2)[2] != 2*D ){ + stop("The number of dimensions in the covariance + matrices is not well posed") + } true.group <- sample(1:k,N,replace=TRUE,prob=rep(1/k,k)) - Spike <- array(c(Sigma.p1,Sigma.p2), dim=c(2,2,2)) + Spike <- array(c(Sigma.p1,Sigma.p2), dim=c(D,D,2)) # Probability matrix of subgroups matrixpi <- matrix(rep(W,k), nrow=k, ncol=2, byrow = T) sotto.gruppi <- matrix(0, nrow=k, ncol=N) @@ -174,7 +176,7 @@ piv_sim <- function(N, } # Simulation of N units from a mixture of mixtures - y <- matrix(NA,nrow=N,ncol=2) + y <- matrix(NA,nrow=N,ncol=D) for (i in 1:length(true.group)){ y[i,] <- mvrnorm(1, Mu[true.group[i],], diff --git a/build/vignette.rds b/build/vignette.rds index 805bfd2..c530895 100644 Binary files a/build/vignette.rds and b/build/vignette.rds differ diff --git a/inst/doc/K-means_clustering_using_the_MUS_algorithm.R b/inst/doc/K-means_clustering_using_the_MUS_algorithm.R index 84b7425..459e1fc 100644 --- a/inst/doc/K-means_clustering_using_the_MUS_algorithm.R +++ b/inst/doc/K-means_clustering_using_the_MUS_algorithm.R @@ -6,6 +6,7 @@ knitr::opts_chunk$set( ## ----load, warning =FALSE, message=FALSE--------------------------------- library(pivmet) +library(mvtnorm) ## ----mus, echo =TRUE, eval = TRUE, message = FALSE, warning = FALSE------ #generate some data @@ -20,9 +21,9 @@ x <- matrix(NA, n,2) truegroup <- c( rep(1,n1), rep(2, n2), rep(3, n3)) -x[1:n1,]=rmvnorm(n1, c(1,5), sigma=diag(2)) -x[(n1+1):(n1+n2),]=rmvnorm(n2, c(4,0), sigma=diag(2)) -x[(n1+n2+1):(n1+n2+n3),]=rmvnorm(n3,c(6,6),sigma=diag(2)) +x[1:n1,] <- rmvnorm(n1, c(1,5), sigma=diag(2)) +x[(n1+1):(n1+n2),] <- rmvnorm(n2, c(4,0), sigma=diag(2)) +x[(n1+n2+1):(n1+n2+n3),] <- rmvnorm(n3,c(6,6),sigma=diag(2)) H <- 1000 a <- matrix(NA, H, n) @@ -40,13 +41,13 @@ sim_matr <- matrix(1, n,n) } } -cl <- KMeans(x, centers)$cluster +cl <- RcmdrMisc::KMeans(x, centers)$cluster mus_alg <- MUS(C = sim_matr, clusters = cl, prec_par = 5) ## ----kmeans, echo =FALSE, fig.show='hold', eval = TRUE, message = FALSE, warning = FALSE---- - kmeans_res <- KMeans(x, centers) + kmeans_res <- RcmdrMisc::KMeans(x, centers) ## ----kmeans_plots, echo =FALSE, fig.show='hold', eval 
= TRUE, message = FALSE, warning = FALSE---- diff --git a/inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd b/inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd index 17d780a..df5bac4 100644 --- a/inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd +++ b/inst/doc/K-means_clustering_using_the_MUS_algorithm.Rmd @@ -21,6 +21,7 @@ In this vignette we explore the K-means algorithm performed using the MUS algori ```{r load, warning =FALSE, message=FALSE} library(pivmet) +library(mvtnorm) ``` @@ -71,9 +72,9 @@ x <- matrix(NA, n,2) truegroup <- c( rep(1,n1), rep(2, n2), rep(3, n3)) -x[1:n1,]=rmvnorm(n1, c(1,5), sigma=diag(2)) -x[(n1+1):(n1+n2),]=rmvnorm(n2, c(4,0), sigma=diag(2)) -x[(n1+n2+1):(n1+n2+n3),]=rmvnorm(n3,c(6,6),sigma=diag(2)) +x[1:n1,] <- rmvnorm(n1, c(1,5), sigma=diag(2)) +x[(n1+1):(n1+n2),] <- rmvnorm(n2, c(4,0), sigma=diag(2)) +x[(n1+n2+1):(n1+n2+n3),] <- rmvnorm(n3,c(6,6),sigma=diag(2)) H <- 1000 a <- matrix(NA, H, n) @@ -91,7 +92,7 @@ sim_matr <- matrix(1, n,n) } } -cl <- KMeans(x, centers)$cluster +cl <- RcmdrMisc::KMeans(x, centers)$cluster mus_alg <- MUS(C = sim_matr, clusters = cl, prec_par = 5) @@ -104,7 +105,7 @@ In some situations, classical K-means fails in recognizing the *true* groups: ```{r kmeans, echo =FALSE, fig.show='hold', eval = TRUE, message = FALSE, warning = FALSE} - kmeans_res <- KMeans(x, centers) + kmeans_res <- RcmdrMisc::KMeans(x, centers) ``` diff --git a/inst/doc/K-means_clustering_using_the_MUS_algorithm.html b/inst/doc/K-means_clustering_using_the_MUS_algorithm.html index 1f4ef8d..754f526 100644 --- a/inst/doc/K-means_clustering_using_the_MUS_algorithm.html +++ b/inst/doc/K-means_clustering_using_the_MUS_algorithm.html

K-means clustering using MUS and other pivotal algorithms

Leonardo Egidi

-2019-06-24
+2020-06-03

In this vignette we explore the K-means algorithm performed using the MUS algorithm and other pivotal methods through the function piv_KMeans of the pivmet package. First of all, we load the package:
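In code, that load step corresponds to something like the following (attaching mvtnorm as well, in line with the change made to the Rmd source earlier in this diff, since rmvnorm() is used to simulate the example data):

library(pivmet)
library(mvtnorm)   # provides rmvnorm(), used below to simulate bivariate data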

Pivotal algorithms: how they work, and why

We present here a simulated case for applying our procedure. Given \(n\) units \(y_1,\ldots,y_n\):

@@ -302,48 +329,48 @@ Pivotal algorithms: how they work, and why

obtaining the most distant unit among the members that minimize the global dissimilarity between one group and all the others (minsumnoint).

The MUS algorithm described in Egidi et al. (2018b) is a sequential procedure for extracting identity submatrices of small rank and pivotal units from large and sparse matrices. The procedure has already been satisfactorily applied for solving the label switching problem in Bayesian mixture models (Egidi et al. 2018c).

With the function MUS the user may detect pivotal units from a co-association matrix C, obtained through \(H\) different partitions, whose units may belong to \(k\) groups, expressed by the argument clusters. We remark here that the MUS algorithm may be performed only when \(k < 5\).

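As a compact, self-contained sketch of that workflow (it condenses, and slightly simplifies, the chunk from the Rmd source shown earlier in this diff; the three group sizes are illustrative choices, not values fixed by the vignette):

library(pivmet)
library(mvtnorm)
set.seed(500)
n1 <- 20; n2 <- 100; n3 <- 500          # illustrative, unbalanced group sizes
n  <- n1 + n2 + n3
centers <- 3
x <- rbind(rmvnorm(n1, c(1, 5), sigma = diag(2)),
           rmvnorm(n2, c(4, 0), sigma = diag(2)),
           rmvnorm(n3, c(6, 6), sigma = diag(2)))
# co-association matrix over H k-means partitions: c_ip = n_ip / H
H <- 1000
a <- matrix(NA, H, n)
for (h in 1:H) a[h, ] <- kmeans(x, centers)$cluster
sim_matr <- matrix(1, n, n)
for (i in 1:(n - 1)) {
  for (p in (i + 1):n) {
    sim_matr[i, p] <- sum(a[, i] == a[, p]) / H
    sim_matr[p, i] <- sim_matr[i, p]
  }
}
cl <- RcmdrMisc::KMeans(x, centers)$cluster            # reference partition
mus_alg <- MUS(C = sim_matr, clusters = cl, prec_par = 5)
mus_alg$pivots                                         # one pivotal unit per group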

piv_KMeans: k-means clustering via pivotal units

In some situations, classical K-means fails in recognizing the true groups:


For instance, when the groups are unbalanced or non-spherical shaped, we may need a more robust version of the classical K-means. The pivotal units may be used as initial seeds for K-means method (Egidi et al. 2018a). The function piv_KMeans works as the kmeans function, with some optional arguments

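A sketch of the kind of call this refers to, on data simulated as above (the argument names mirror the piv_KMeans signature shown in R/musk.R; the group sizes are again only illustrative):

library(pivmet)
library(mvtnorm)
set.seed(101)
x <- rbind(rmvnorm(20,  c(1, 5), sigma = diag(2)),
           rmvnorm(100, c(4, 0), sigma = diag(2)),
           rmvnorm(500, c(6, 6), sigma = diag(2)))
piv_res <- piv_KMeans(x, centers = 3, alg.type = "KMeans",
                      piv.criterion = "MUS", H = 1000)
# piv_res collects the pivot-seeded k-means solution (see ?piv_KMeans for the returned components)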

The function piv_KMeans has optional arguments:

  • alg.type: The clustering algorithm for the initial partition of the \(N\) units into the desired number of clusters. Possible choices are "KMeans" (default) and "hclust".

@@ -365,13 +392,16 @@

    References

    ———. 2018b. “Maxima Units Search (Mus) Algorithm: Methodology and Applications.” In Studies in Theoretical and Applied Statistics, edited by C. Perna, M. Pratesi, and A. Ruiz-Gazen, 71–81. Springer.

-———. 2018c. “Relabelling in Bayesian Mixture Models by Pivotal Units.” Statistics and Computing 28 (4). Springer: 957–69.
+———. 2018c. “Relabelling in Bayesian Mixture Models by Pivotal Units.” Statistics and Computing 28 (4): 957–69.

diff --git a/inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.html b/inst/doc/Relabelling_in_Bayesian_mixtures_by_pivotal_units.html

Dealing with label switching: relabelling in Bayesian mixture models by pivotal units

Leonardo Egidi

-2019-06-24
+2020-06-03

In this vignette we explore the fit of Gaussian mixture models and the relabelling pivotal method proposed in Egidi et al. (2018b) through the pivmet package. First of all, we load the package:

-The pivmet package provides a simple framework to (i) fit univariate and bivariate mixture models according to a Bayesian flavour and detect the pivotal units, via the piv_MCMC function; (ii) perform the relabelling step via the piv_rel function.
+The pivmet package provides a simple framework to (i) fit univariate and multivariate mixture models according to a Bayesian flavour and detect the pivotal units, via the piv_MCMC function; (ii) perform the relabelling step via the piv_rel function.

There are two main functions used for this task.

The function piv_MCMC:

  • performs MCMC sampling for Gaussian mixture models using the underlying rjags or rstan packages (chosen by the users through the optional function argument software, by default set to rjags). Precisely the package uses: the JAGSrun function of the bayesmix package for univariate mixture models; the run.jags function of the runjags package for bivariate mixture models; the stan function of the rstan package for both univariate and bivariate mixture models.

-  • Post-processes the chains and randomly swithes their values.
+  • Post-processes the chains and randomly switches their values (Frühwirth-Schnatter 2001).

  • Builds a co-association matrix for the \(N\) units. After \(H\) MCMC iterations, the function implements a clustering procedure with \(k\) groups, the user may choose among agglomerative or divisive hierarchical clustering through the optional argument clustering. Using the latent formulation for mixture models, we denote with \([z_i]_h\) the group allocation of the \(i\)-th unit at the \(h\)-th iteration. In such a way, a co-association matrix \(C\) for the \(N\) statistical units is built, where the single cell \(c_{ip}\) is the fraction of times the unit \(i\) and the unit \(p\) belong to the same group along the \(H\) MCMC iterations:

\[c_{ip} = \frac{n_{ip}}{H}=\frac{1}{H} \sum_{h=1}^{H}|[z_i]_h=[z_p]_h|,\] where \(|\cdot|\) denotes the event indicator and \(n_{ip}\) is the number of times the units \(i, \ p\) belong to the same group along the sampling.
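For readers who want to see that computation spelled out, a small sketch of the formula above; z_chain is a placeholder name for an \(H \times N\) matrix of group allocations (piv_MCMC builds this matrix internally, and its post-processed allocations are returned as groupPost):

# c_ip = (1/H) * sum_h |[z_i]_h = [z_p]_h|
coassociation <- function(z_chain) {
  H <- nrow(z_chain)
  N <- ncol(z_chain)
  C <- matrix(1, N, N)                      # c_ii = 1 by construction
  for (i in 1:(N - 1)) {
    for (p in (i + 1):N) {
      C[i, p] <- sum(z_chain[, i] == z_chain[, p]) / H
      C[p, i] <- C[i, p]
    }
  }
  C
}
# e.g. C <- coassociation(res$groupPost), using the allocations returned by piv_MCMC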

@@ -312,227 +340,235 @@ 2019-06-24

Example: bivariate Gaussian data

Suppose now that \(\boldsymbol{y}_i \in \mathbb{R}^2\) and assume that:

-\[\boldsymbol{y}_i \sim \sum_{j=1}^{k}\pi_{j}\mathcal{N}_{2}(\boldsymbol{\mu}_{j}, \boldsymbol{\Sigma})\] where \(\boldsymbol{\mu}_j\) is the mean vector for group \(j\), \(\boldsymbol{\Sigma}\) is a positive-definite covariance matrix and the mixture weight \(\pi_j= P(z_i=j)\) as for the one-dimensional case.
+\[\boldsymbol{y}_i \sim \sum_{j=1}^{k}\eta_{j}\mathcal{N}_{2}(\boldsymbol{\mu}_{j}, \boldsymbol{\Sigma})\] where \(\boldsymbol{\mu}_j\) is the mean vector for group \(j\), \(\boldsymbol{\Sigma}\) is a positive-definite covariance matrix and the mixture weight \(\eta_j= P(z_i=j)\) as for the one-dimensional case.

We may generate Gaussian mixture data through the function piv_sim, specifying the sample size \(N\), the desired number of groups \(k\) and the \(k \times 2\) matrix for the \(k\) mean vectors. The argument W handles the weights for a nested mixture, in which the \(j\)-th component is in turn modelled as a two-component mixture, with covariance matrices \(\boldsymbol{\Sigma}_{p1}, \boldsymbol{\Sigma}_{p2}\), respectively.
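A data-generating call of this kind, reusing the settings from the piv_sim example in the package documentation (R/sim.R in this diff):

library(pivmet)
set.seed(100)
N <- 2000
k <- 3
D <- 2
Mu <- rbind(c(-45, 8), c(45, .1), c(100, 8))   # k x 2 matrix of mean vectors
Sigma.p1 <- diag(D)                            # covariance of subgroup 1
Sigma.p2 <- 20 * diag(D)                       # covariance of subgroup 2
W <- c(0.2, 0.8)                               # subgroup weights
sim <- piv_sim(N = N, k = k, Mu = Mu,
               Sigma.p1 = Sigma.p1, Sigma.p2 = Sigma.p2, W = W)
str(sim$y)                                     # N x 2 matrix of observations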

The function piv_MCMC requires only three mandatory arguments: the data object y, the number of components k and the number of MCMC iterations, nMC. By default, it performs Gibbs sampling using the runjags package. If software="rjags", for bivariate data the priors are specified as:

\[\begin{align}
 \boldsymbol{\mu}_j \sim & \mathcal{N}_2(\boldsymbol{\mu}_0, S_2)\\
- 1/\Sigma \sim & \mbox{Wishart}(S_3, 3)\\
-\pi \sim & \mbox{Dirichlet}(\boldsymbol{\alpha}),
+ \Sigma^{-1} \sim & \mbox{Wishart}(S_3, 3)\\
+\eta \sim & \mbox{Dirichlet}(\boldsymbol{\alpha}),
 \end{align}\]

where \(\boldsymbol{\alpha}\) is a \(k\)-dimensional vector and \(S_2\) and \(S_3\) are positive definite matrices. By default, \(\boldsymbol{\mu}_0=\boldsymbol{0}\), \(\boldsymbol{\alpha}=(1,\ldots,1)\) and \(S_2\) and \(S_3\) are diagonal matrices, with diagonal elements equal to 1e+05. Different values can be specified for the hyperparameters \(\boldsymbol{\mu}_0, S_2, S_3\) and \(\boldsymbol{\alpha}\): priors = list(mu_0 = c(1,1), S2 = ..., S3 = ..., alpha = ...), with the constraint for \(S_2\) and \(S_3\) to be positive definite, and \(\boldsymbol{\alpha}\) a vector of dimension \(k\) with nonnegative elements.
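For example, a call overriding those hyperparameters could look like this (a sketch only: the S2 and S3 values stand in for the elided matrices above and are chosen here as small diagonal matrices; sim and k are the objects from the simulation sketch above):

priors_jags <- list(mu_0  = c(1, 1),
                    S2    = diag(2) * 1e-5,    # illustrative value
                    S3    = diag(2) * 1e-5,    # illustrative value
                    alpha = rep(1, k))
res_jags <- piv_MCMC(y = sim$y, k = k, nMC = 2000,
                     software = "rjags", priors = priors_jags)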

If software="rstan", the function performs Hamiltonian Monte Carlo (HMC) sampling. In this case the priors are specified as:

\[\begin{align}
- \boldsymbol{\mu}_j \sim & \mathcal{N}_2(\boldsymbol{\mu}_0, LDL^{T})\\
+ \boldsymbol{\mu}_j \sim & \mathcal{N}_2(\boldsymbol{\mu}_0, LD^*L^{T})\\
 L \sim & \mbox{LKJ}(\eta)\\
 D_{1,2} \sim & \mbox{HalfCauchy}(0, \sigma_d).
 \end{align}\]

-The covariance matrix is expressed in terms of the LDL decomposition as \(LDL^{T}\), a variant of the classical Cholesky decomposition, where \(L\) is a \(2 \times 2\) lower unit triangular matrix and \(D\) is a \(2 \times 2\) diagonal matrix. The Cholesky correlation factor \(L\) is assigned a LKJ prior with \(\eta\) degrees of freedom, which, combined with priors on the standard deviations of each component, induces a prior on the covariance matrix; as \(\eta \rightarrow \infty\) the magnitude of correlations between components decreases, whereas \(\eta=1\) leads to a uniform prior distribution for \(L\). By default, the hyperparameters are \(\boldsymbol{\mu}_0=\boldsymbol{0}\), \(\sigma_d=2.5, \eta=1\). Different values can be chosen with the argument: priors=list(mu_0=c(1,2), sigma_d = 4, eta = 2).
+The covariance matrix is expressed in terms of the LDL decomposition as \(LD^*L^{T}\), a variant of the classical Cholesky decomposition, where \(L\) is a \(2 \times 2\) lower unit triangular matrix and \(D^*\) is a \(2 \times 2\) diagonal matrix. The Cholesky correlation factor \(L\) is assigned a LKJ prior with \(\epsilon\) degrees of freedom, which, combined with priors on the standard deviations of each component, induces a prior on the covariance matrix; as \(\epsilon \rightarrow \infty\) the magnitude of correlations between components decreases, whereas \(\epsilon=1\) leads to a uniform prior distribution for \(L\). By default, the hyperparameters are \(\boldsymbol{\mu}_0=\boldsymbol{0}\), \(\sigma_d=2.5, \epsilon=1\). Different values can be chosen with the argument: priors=list(mu_0=c(1,2), sigma_d = 4, epsilon = 2).
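The HMC counterpart with the hyperparameter values just quoted would then be something like (again only a sketch; sim and k are the objects from the simulation above):

res_stan <- piv_MCMC(y = sim$y, k = k, nMC = 2000,
                     software = "rstan",
                     priors = list(mu_0 = c(1, 2), sigma_d = 4, epsilon = 2))
# since version 0.3.0 the underlying stanfit object is returned as res_stan$stanfit,
# so it can be inspected with rstan/bayesplot diagnostics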

We fit the model using rjags with 2000 MCMC iterations and default priors:

-res <- piv_MCMC(y = sim$y, k= k, nMC =nMC)
-#> Compiling rjags model...
-#> Calling the simulation using the rjags method...
-#> Note: the model did not require adaptation
-#> Burning in the model for 1000 iterations...
-#> Running the model for 2000 iterations...
-#> Simulation complete
-#> Note: Summary statistics were not produced as there are >50
-#> monitored variables
-#> [To override this behaviour see ?add.summary and ?runjags.options]
-#> FALSEFinished running the simulation
-#> Calculating summary statistics...
-#> Calculating the Gelman-Rubin statistic for 13 variables....
-#> Note: Unable to calculate the multivariate psrf
-#> 
-#> JAGS model summary statistics from 8000 samples (chains = 4; adapt+burnin = 2000):
-#>                                                                       
-#>                     Lower95     Median   Upper95       Mean         SD
-#> muOfClust[1,1]      -470.69     2.8073    507.24     1.6545     204.78
-#> muOfClust[2,1]      -363.18     11.632    312.82     6.0789     134.95
-#> muOfClust[3,1]       -409.7     19.733    368.76     10.937      157.8
-#> muOfClust[1,2]      -493.15     5.4373    477.64     2.9288      205.2
-#> muOfClust[2,2]      -307.29      5.532    364.92     5.1804     130.89
-#> muOfClust[3,2]         -365     5.4355    395.23     9.8295     150.68
-#> tauOfClust[1,1]   0.0020346  0.0049794 0.0072836  0.0046974  0.0016869
-#> tauOfClust[2,1] -0.00084287 0.00075679 0.0024986 0.00081325 0.00083372
-#> tauOfClust[1,2] -0.00084287 0.00075679 0.0024986 0.00081325 0.00083372
-#> tauOfClust[2,2]   0.0044738  0.0065349  0.013155  0.0071613  0.0024919
-#> pClust[1]         1.721e-06   0.088032   0.75833    0.24344    0.26825
-#> pClust[2]        2.6503e-06    0.46075   0.98999    0.43099    0.29444
-#> pClust[3]        1.2463e-07    0.39488   0.68405    0.32557    0.25832
-#>                                                                
-#>                 Mode       MCerr MC%ofSD SSeff     AC.10   psrf
-#> muOfClust[1,1]    --      2.3018     1.1  7914   -0.0121 1.0547
-#> muOfClust[2,1]    --      2.2922     1.7  3466   0.48907 1.1321
-#> muOfClust[3,1]    --      2.4386     1.5  4187   0.46573 1.1429
-#> muOfClust[1,2]    --      2.0273       1 10245 -0.038876 1.0639
-#> muOfClust[2,2]    --      1.8798     1.4  4848    0.2286 1.1338
-#> muOfClust[3,2]    --      2.5275     1.7  3554   0.30636 1.1493
-#> tauOfClust[1,1]   --  0.00014724     8.7   131   0.63828 1.2493
-#> tauOfClust[2,1]   -- 0.000032446     3.9   660   0.22476 1.0227
-#> tauOfClust[1,2]   -- 0.000032446     3.9   660   0.22476 1.0227
-#> tauOfClust[2,2]   --  0.00010136     4.1   604   0.49173 1.1497
-#> pClust[1]         --    0.048223      18    31    0.9082 1.3505
-#> pClust[2]         --    0.037769    12.8    61    0.8433 1.3142
-#> pClust[3]         --    0.043474    16.8    35   0.90465 1.3719
-#> 
-#> Total time taken: 33.3 seconds
+res <- piv_MCMC(y = sim$y, k= k, nMC =nMC,
+                piv.criterion = "maxsumdiff")
+#> Compiling rjags model...
+#> Calling the simulation using the rjags method...
+#> Note: the model did not require adaptation
+#> Burning in the model for 1000 iterations...
+#> Running the model for 2000 iterations...
+#> Simulation complete
+#> Note: Summary statistics were not produced as there are >50
+#> monitored variables
+#> [To override this behaviour see ?add.summary and ?runjags.options]
+#> FALSEFinished running the simulation
+#> Calculating summary statistics...
+#> Calculating the Gelman-Rubin statistic for 13 variables....
+#> Note: Unable to calculate the multivariate psrf
+#> 
+#> JAGS model summary statistics from 8000 samples (chains = 4; adapt+burnin = 2000):
+#>                                                                   
+#>             Lower95     Median   Upper95       Mean        SD Mode
+#> mu[1,1]     -14.375    -11.558   -8.7572    -11.508    1.4387   --
+#> mu[2,1]      6.3863     9.1717    11.857     9.1522     1.405   --
+#> mu[3,1]      28.183     31.396    34.365     31.293    1.6088   --
+#> mu[1,2]      4.4457     6.9535    9.6217     6.9353    1.3256   --
+#> mu[2,2]     -2.0663    0.66669    3.1435    0.59162    1.3823   --
+#> mu[3,2]      2.4112     5.5864    8.8895     5.5697    1.6513   --
+#> tau[1,1]   0.012012   0.018326  0.025218   0.018277 0.0032905   --
+#> tau[2,1] -0.0027039 0.00068687 0.0041266 0.00073708 0.0017354   --
+#> tau[1,2] -0.0027039 0.00068687 0.0041266 0.00073708 0.0017354   --
+#> tau[2,2]  0.0086829   0.010996  0.013493   0.011059 0.0012145   --
+#> eta[1]       0.2443    0.33107   0.41591    0.33154  0.044196   --
+#> eta[2]       0.3231    0.42022   0.51657    0.41885  0.049808   --
+#> eta[3]      0.17876    0.24766   0.32376     0.2496  0.037867   --
+#>                                                    
+#>                MCerr MC%ofSD SSeff     AC.10   psrf
+#> mu[1,1]     0.039294     2.7  1341  0.094617 1.0084
+#> mu[2,1]     0.039178     2.8  1286  0.040806 1.0044
+#> mu[3,1]     0.046847     2.9  1179   0.14019 1.0087
+#> mu[1,2]     0.020291     1.5  4268  0.016887 1.0003
+#> mu[2,2]     0.029752     2.2  2159   0.10913 1.0059
+#> mu[3,2]     0.030215     1.8  2987 0.0025848 1.0004
+#> tau[1,1]  0.00010956     3.3   902    0.1751 1.0127
+#> tau[2,1] 0.000046848     2.7  1372  0.032665 1.0006
+#> tau[1,2] 0.000046848     2.7  1372  0.032665 1.0006
+#> tau[2,2] 0.000018206     1.5  4450  0.021152  1.003
+#> eta[1]     0.0011124     2.5  1578  0.052196 1.0015
+#> eta[2]     0.0012608     2.5  1561   0.11146  1.006
+#> eta[3]    0.00078429     2.1  2331  0.071984 1.0018
+#> 
+#> Total time taken: 32.4 seconds

Once we obtain posterior estimates, label switching is likely to occur. For this reason, we need to relabel our chains as explained above. To relabel the chains, we use the function piv_rel, which only needs the mcmc = res argument. Relabelled outputs can then be displayed through the function piv_plot, with different options for the argument type:

  • chains: plot the relabelled chains;
  • hist: plot the point estimates against the histogram of the data.

The optional argument par takes four alternative values: mean, sd, weight and all, for the means, standard deviations, weights, or all three parameters, respectively. By default, par="all".
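
A minimal sketch of these steps, using the res object just obtained:

rel <- piv_rel(mcmc = res)                      # relabel the chains
piv_plot(y = sim$y, res, rel, type = "chains")  # relabelled traceplots
piv_plot(y = sim$y, res, rel, type = "hist")    # point estimates against the data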


Example: fishery data

The Fishery dataset has been previously used by Titterington, Smith, and Makov (1985) and Papastamoulis (2016) and consists of 256 snapper length measurements. It is contained in the bayesmix package (Grün 2011). We may display the histogram of the data, along with an estimated kernel density.
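
For instance, assuming the measurements are stored in the first column of the fish data frame shipped with bayesmix:

library(bayesmix)
data(fish)
y <- fish[, 1]
# histogram with an estimated kernel density curve
hist(y, breaks = 40, freq = FALSE, xlab = "Length", main = "Fishery data")
lines(density(y), lwd = 2)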


We assume a mixture model with \(k=5\) groups:

\[\begin{equation}
y_i \sim \sum_{j=1}^k \eta_j \mathcal{N}(\mu_j, \sigma^2_j), \ \ i=1,\ldots,n,
\label{eq:fishery}
\end{equation}\]

where \(\mu_j, \sigma_j\) are the mean and the standard deviation of group \(j\), respectively. Moreover, assume that \(z_1,\ldots,z_n\) is an unobserved latent sequence of i.i.d. random variables following the multinomial distribution with weights \(\boldsymbol{\eta}=(\eta_{1},\dots,\eta_{k})\), such that:

\[P(z_i=j)=\eta_j,\]

where \(\eta_j\) is the mixture weight assigned to the group \(j\).

We fit our model by simulating \(H=15000\) samples from the posterior distribution of \((\boldsymbol{z}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\eta})\). In the univariate case, if the argument software="rjags" is selected (the default option), Gibbs sampling is performed by the package bayesmix, and the priors are:

\[\begin{eqnarray}
\mu_j \sim & \mathcal{N}(\mu_0, 1/B_0)\\
\sigma_j \sim & \mbox{invGamma}(\nu_0/2, \nu_0S_0/2)\\
\eta \sim & \mbox{Dirichlet}(\boldsymbol{\alpha})\\
S_0 \sim & \mbox{Gamma}(g_0/2, g_0G_0/2),
\end{eqnarray}\]

with default values: \(B_0=0.1\), \(\nu_0 =20\), \(g_0 = 10^{-16}\), \(G_0 = 10^{-16}\), \(\boldsymbol{\alpha}=(1,1,\ldots,1)\). Users may specify their own hyperparameters via the priors argument, declaring a named list such as: priors = list(mu_0=2, alpha = rep(2, k), ...).
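
As a sketch with illustrative values (y as loaded above):

# Illustrative: override two rjags hyperparameters, keep the others at their defaults
res_prior <- piv_MCMC(y = y, k = 5, nMC = 15000,
                      priors = list(mu_0 = 2, alpha = rep(2, 5)))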

If software="rstan" is selected, the priors are:

\[\begin{eqnarray}
\mu_j & \sim \mathcal{N}(\mu_0, 1/B0inv)\\
\sigma_j & \sim \mbox{Lognormal}(\mu_{\sigma}, \tau_{\sigma})\\
\eta_j & \sim \mbox{Uniform}(0,1),
\end{eqnarray}\]

where the vector of the weights \(\boldsymbol{\eta}=(\eta_1,\ldots,\eta_k)\) is a \(k\)-simplex. Default hyperparameter values are: \(\mu_0=0, B0inv=0.1, \mu_{\sigma}=0, \tau_{\sigma}=2\). Here too, users may choose their own hyperparameter values in the same way: priors = list(mu_sigma = 0, tau_sigma = 1,...).
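
A corresponding sketch for the rstan parameterisation (illustrative values):

res_stan_prior <- piv_MCMC(y = y, k = 5, nMC = 3000, software = "rstan",
                           priors = list(mu_sigma = 0, tau_sigma = 1))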

We fit the model using the rjags method, and we set the burn-in period to 7500.
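
A sketch of the call, assuming the burn-in is controlled by the burn argument of piv_MCMC:

k   <- 5
nMC <- 15000
res <- piv_MCMC(y = y, k = k, nMC = nMC, burn = 7500)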


First of all, we may access the true number of iterations by typing:
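
For instance, assuming this value is returned in the true.iter element of the fitted object:

res$true.iter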


We can check whether label switching occurred by looking at the traceplots of the mean parameters, \(\mu_j\). To do this, we apply the function piv_rel to relabel the chains and obtain useful inferences; its only argument is the MCMC result just obtained with piv_MCMC. The function piv_plot then provides some graphical tools: traceplots (argument type="chains") and histograms along with the final relabelled means (argument type="hist"). For both plot types, the function prints a message explaining how to interpret the results.
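
A sketch of the corresponding calls:

rel <- piv_rel(mcmc = res)
piv_plot(y = y, res, rel, type = "chains")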


#> Description: traceplot of the raw MCMC chains and the relabelled chains for the means parameters. Each colored chain corresponds to one of the k distinct parameters of the mixture model. Overlapping chains may reveal that the MCMC sample is not able to distinguish between the components.
 piv_plot(y=y, res, rel, type="hist")

#> Description: histograms of the data along with the estimated posterior means (red points) from raw MCMC and relabelling algorithm. The blue line is the estimated density curve.

The first plot displays the traceplots for the parameters \(\boldsymbol{\mu}\). From the left plot, showing the raw output of the Gibbs sampler, we note that label switching clearly occurred. Our algorithm seems able to reorder the means \(\mu_j\) and the weights \(\eta_j\), for \(j=1,\ldots,k\). Of course, an MCMC sampler that does not switch the labels would be ideal, but it is nearly impossible to program. However, we can assess how two different samplers perform by repeating the analysis above with software="rstan" in the piv_MCMC function.
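
A sketch of the rstan refit (the number of iterations is illustrative):

res_stan <- piv_MCMC(y = y, k = k, nMC = 3000, software = "rstan")
rel_stan <- piv_rel(mcmc = res_stan)
piv_plot(y = y, res_stan, rel_stan, type = "chains")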


With the rstan option, we can also pass the fitted Stan object to the bayesplot functions:
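
For example, assuming the fitted Stan object is stored in the stanfit element of the returned list and that the mean parameters are named mu in the Stan model:

library(bayesplot)
posterior <- as.array(res_stan$stanfit)       # posterior draws (iterations x chains x parameters)
mcmc_intervals(posterior, regex_pars = "mu")  # interval plot for the means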


Regardless of the software we choose, we may extract the underlying JAGS/Stan model code by typing:
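
For instance, assuming the model code is stored in the model element of the returned list:

cat(res$model)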

@@ -542,7 +578,10 @@

References

Egidi, Leonardo, Roberta Pappadà, Francesco Pauli, and Nicola Torelli. 2018a. “Maxima Units Search (Mus) Algorithm: Methodology and Applications.” In Studies in Theoretical and Applied Statistics, edited by C. Perna, M. Pratesi, and A. Ruiz-Gazen, 71–81. Springer.


———. 2018b. “Relabelling in Bayesian Mixture Models by Pivotal Units.” Statistics and Computing 28 (4): 957–69.


Frühwirth-Schnatter, Sylvia. 2001. “Markov Chain Monte Carlo Estimation of Classical and Dynamic Switching and Mixture Models.” Journal of the American Statistical Association 96 (453): 194–209.

Grün, B. 2011. “bayesmix: Bayesian Mixture Models with JAGS. R Package Version 0.7-2.” https://CRAN.R-project.org/package=bayesmix.

@@ -558,6 +597,9 @@

References
