nimble-dev · paciorek · Jul 10, 2025 · Jun 20, 2025 · Jun 20, 2025 · Jun 27, 2025
diff --git a/UserManual/src/chapter_BNP.Rmd b/UserManual/src/chapter_BNP.Rmd
@@ -13,25 +13,27 @@ require(nimble)
 
 ## Bayesian nonparametric mixture models {#sec:bnpmixtures}
 
-NIMBLE provides support  for Bayesian nonparametric (BNP) mixture modeling.  The current implementation provides support for hierarchical specifications involving Dirichlet process (DP) mixtures [@ferguson_73;@ferguson_74;@lo_84;@escobar_94;@escobar_west_95].   More specifically, a DP mixture model takes the form
+NIMBLE provides support  for Bayesian nonparametric (BNP) mixture modeling.  The current implementation provides support for hierarchical specifications involving Dirichlet process (DP) mixtures [@ferguson_73;@ferguson_74;@lo_84;@escobar_94;@escobar_west_95]. These allow one to avoid specifying a particular parametric distribution for a given node (parameter) in a model. Instead one can use a DP mixture as a much more general, nonparametric distribution. For example, a normal distribution for a random effect could be replaced by a DP mixture of normal distributions, with the number of components of the mixture being determined from the data and not fixed in advance.
+
+We'll first introduce the general, technical definition of a DP mixture model before describing the Chinese Restaurant Process representation, which may be more interpretable for many readers. More specifically, a DP mixture model for a random variable $y_i$ takes the form
 
 $$y_i \mid G \overset{iid}{\sim} \int h(y_i \mid \theta) G(d\theta),$$
 $$G \mid \alpha, G_0 \sim  DP(\alpha, G_0),$$
 
-where $h(\cdot \mid \theta)$ is a suitable kernel with parameter $\theta$,  and  $\alpha$  and $G_0$ are the concentration and  baseline distribution parameters of the DP, respectively. DP mixture models can be written with different levels of hierarchy, all being equivalent to the model above.
+where $h(\cdot \mid \theta)$ is a suitable kernel (i.e., probability density/mass function) with parameter $\theta$,  and  $\alpha$  and $G_0$ are the concentration and  baseline distribution parameters of the DP, respectively. DP mixture models can be written with different levels of hierarchy, all being equivalent to the model above. While "y" would often be used as notation for a data value, it is used generically here, noting that often DP mixtures are used for random effects rather than directly for observations.
 
-When the random measure $G$ is integrated out from the model, the DP mixture model can be written using  latent or membership variables, $z_i$,  following a Chinese Restaurant Process (CRP) distribution [@blackwell_mcqueen_73], discussed in Section \@ref(sec:crp).  The model takes the form 
+When the random distribution (also referred to as a random 'measure') $G$ is integrated out from the model, the DP mixture model can be written using  latent or membership variables, $z_i$,  following a Chinese Restaurant Process (CRP) distribution [@blackwell_mcqueen_73], discussed in Section \@ref(sec:crp).  The model takes the form 
 
 $$y_i \mid \tilde{\boldsymbol{\theta}}, z_i \overset{ind}{\sim} h(\cdot \mid \tilde{\theta}_{z_i}),$$
 $$\boldsymbol{z}\mid \alpha \sim \mbox{CRP}(\alpha),\hspace{0.5cm}  \tilde{\theta}_j \overset{iid}{\sim}G_0,$$
-where $\mbox{CRP}(\alpha)$ denotes the CRP  distribution with concentration parameter $\alpha$.
+where $\mbox{CRP}(\alpha)$ denotes the CRP  distribution with concentration parameter $\alpha$. Put in perhaps more intuitive terms, $z_i$ says which cluster/group the $i$th unit is in, and the parameter $\tilde{\theta}_j$ for group $j$ is distributed according to the $G_0$ baseline distribution. The parameter $\alpha$ controls how dispersed the clustering is, described more in the next section.
 
-If a stick-breaking representation [@sethuraman_94], discussed in section \@ref(sec:sb), is  assumed for the random measure $G$, then the model takes the form
+If a stick-breaking representation [@sethuraman_94], discussed in section \@ref(sec:sb), is  assumed for the random distribution (measure) $G$, then the model takes the form
 
 $$y_i \mid {\boldsymbol{\theta}}^{\star}, \boldsymbol{v} \overset{ind}{\sim} \sum_{l=1}^{\infty}\left\{ v_l\prod_{m<l}(1-v_m)\right\} h(\cdot \mid {\theta}_l^{\star}),$$
 $$v_l \mid \alpha \overset{iid}{\sim} Beta(1, \alpha),\hspace{0.5cm}  {\theta}_l^{\star} \overset{iid}{\sim}G_0.$$
 
-\noindent More general representations of the random measure can be specify by considering $v_l \mid \nu_l, \alpha_l \overset{ind}{\sim} Beta(\nu_l, \alpha_l)$. Finite dimensional approximations can be obtained by truncating the infinite sum to have $L$ components. 
+\noindent More general representations of the random distribution can be specify by considering $v_l \mid \nu_l, \alpha_l \overset{ind}{\sim} Beta(\nu_l, \alpha_l)$. Finite dimensional approximations can be obtained by truncating the infinite sum to have $L$ components. 
 
 Different representations of  DP mixtures lead to different computational algorithms. NIMBLE supports sampling algorithms based on the CRP representation, as well as on the stick-breaking representation.  NIMBLE includes definitions of structures required to implement the CRP and stick-breaking distributions, and  the associated MCMC algorithms.
 

diff --git a/UserManual/src/chapter_InstallingNimble.Rmd b/UserManual/src/chapter_InstallingNimble.Rmd
@@ -88,6 +88,15 @@ method if you download the package source directly.
 
 NIMBLE can also be obtained from the [NIMBLE website](https://r-nimble.org). To install from our website, please see our [Download page](https://r-nimble.org/download) for the specific invocation of `install.packages`.
 
+To test that the installation worked and can use NIMBLE's compilation system, you can run this small test code, which should run without error and produce a value for `y` in the model.
+
+```{r, eval=FALSE}
+code <- nimbleCode({ y ~ dnorm(0,1) })
+model <- nimbleModel(code)
+comp_model <- compileNimble(model)
+comp_model$simulate('y')
+comp_model$y
+```
 
 ## Troubleshooting installation problems
 
@@ -125,6 +134,7 @@ For MacOS:
 
 All operating systems:
 
+  - To determine if the problem is with the availability of the C++ compiler or with NIMBLE itself, you can try to install the `Rcpp` package. If you can install `Rcpp` and successfully run this command in R: `Rcpp::evalCpp("2 + 2")` and get 4, that suggests the problem is with NIMBLE. If not, then the problem is probably with the C++ compiler or its use from R and not with NIMBLE itself.
   - If problems arise from generating and compiling C++ files from the default location in R's `tempdir()`, one can use the `dirName` argument to `compileNimble` to put such files elsewhere, such as in a local working directory. 
 
 If those suggestions don't help, please post about installation problems to the [nimble-users Google group](https://groups.google.com/forum/#!forum/nimble-users) or 

diff --git a/UserManual/src/chapter_MCMC.Rmd b/UserManual/src/chapter_MCMC.Rmd
@@ -881,7 +881,7 @@ If `fixedValue` is given when using `indicatorNodes` the values provided in `fix
 
 
 
-NIMBLE provides a variety of other commonly-used samplers, some for general-purpose use and some for specialized use with particular distributions or situations. Some that are worth particular consideration are:
+NIMBLE provides a variety of other commonly-used samplers, some for general-purpose use and some for specialized use with particular distributions or situations. Some that are worth particular consideration as they may perform better than the samplers that NIMBLE assigns by default are:
 
   1. the Hamiltonian Monte Carlo (HMC) sampler in the `nimbleHMC` package,
   2. the slice sampler for sampling scalar parameters in place of the default `RW` (Metropolis) sampler,

diff --git a/packages/nimble/R/BNP_samplers.R b/packages/nimble/R/BNP_samplers.R
@@ -14,12 +14,12 @@ getNsave <- nimbleFunction(
 ##  Wrapper function for sampleDPmeasure
 ##-----------------------------------------
 
-#' Get posterior samples for a Dirichlet process measure
+#' Get posterior samples for a Dirichlet process distribution (measure)
 #'
-#' This function obtains posterior samples from a Dirichlet process distributed random measure of a model specified using the \code{dCRP} distribution.
+#' This function obtains posterior samples from a Dirichlet process distributed random distribution (measure) of a model specified using the \code{dCRP} distribution.
 #'
 #' @param MCMC an MCMC class object, either compiled or uncompiled.
-#' @param epsilon  used for determining the truncation level of the representation of the random measure.
+#' @param epsilon  used for determining the truncation level of the representation of the random distribution (measure).
 #' @param setSeed Logical or numeric argument. If a single numeric value is provided, R's random number seed will be set to this value. In the case of a logical value, if \code{TRUE}, then R's random number seed will be set to \code{1}. Note that specifying the argument \code{setSeed = 0} does not prevent setting the RNG seed, but rather sets the random number generation seed to \code{0}.  Default value is \code{FALSE}.
 #'
 #' @param progressBar Logical specifying whether to display a progress bar during execution (default = TRUE).  The progress bar can be permanently disabled by setting the system option \code{nimbleOptions(MCMCprogressBar = FALSE)}
@@ -28,15 +28,15 @@ getNsave <- nimbleFunction(
 #' 
 #' @export
 #' @details
-#' This function provides samples from a random measure having a Dirichlet process prior. Realizations are almost surely discrete and represented by a (finite) stick-breaking representation (Sethuraman, 1994), whose atoms (or point masses) are independent and identically distributed. This sampler can only be used with models containing a \code{dCRP} distribution. 
+#' This function provides samples from a random distribution (measure) having a Dirichlet process prior. Realizations are almost surely discrete and represented by a (finite) stick-breaking representation (Sethuraman, 1994), whose atoms (or point masses) are independent and identically distributed. This sampler can only be used with models containing a \code{dCRP} distribution. 
 #'
-#' The \code{MCMC} argument is an object of class MCMC provided by \code{buildMCMC}, or its compiled version. The MCMC should already have been run, as \code{getSamplesDPmeasure} uses the posterior samples to generate samples of the random measure. Note that the monitors associated with that MCMC must include the cluster membership variable (which has the \code{dCRP} distribution), the cluster parameter variables, all variables directly determining the \code{dCRP} concentration parameter, and any stochastic parent variables of the cluster parameter variables. See \code{help(configureMCMC)} or \code{help(addMonitors)} for information on specifying monitors for an MCMC.
+#' The \code{MCMC} argument is an object of class MCMC provided by \code{buildMCMC}, or its compiled version. The MCMC should already have been run, as \code{getSamplesDPmeasure} uses the posterior samples to generate samples of the random distribution (measure). Note that the monitors associated with that MCMC must include the cluster membership variable (which has the \code{dCRP} distribution), the cluster parameter variables, all variables directly determining the \code{dCRP} concentration parameter, and any stochastic parent variables of the cluster parameter variables. See \code{help(configureMCMC)} or \code{help(addMonitors)} for information on specifying monitors for an MCMC.
 #' 
 #' The \code{epsilon} argument is optional and used to determine the truncation level of the random measure. \code{epsilon} is the tail probability of the random measure, which together with posterior samples of the concentration parameter, determines the truncation level. The default value is 1e-4.
 #'  
-#' The output is a list of matrices. Each matrix represents a sample from the random measure. In order to reduce the output's dimensionality, the weights of identical atoms are added up. The stick-breaking weights are named \code{weights} and the atoms are named based on the cluster variables in the model.
+#' The output is a list of matrices. Each matrix represents a sample from the random distribution (measure). In order to reduce the output's dimensionality, the weights of identical atoms are added up. The stick-breaking weights are named \code{weights} and the atoms are named based on the cluster variables in the model.
 #' 
-#' For more details about sampling the random measure and determining its truncation level, see Section 3 in Gelfand, A.E. and Kottas, A. 2002.
+#' For more details about sampling the random distribution (measure) and determining its truncation level, see Section 3 in Gelfand, A.E. and Kottas, A. 2002.
 #' 
 #' @seealso \code{\link{buildMCMC}}, \code{\link{configureMCMC}}, 
 #' @references

diff --git a/packages/nimble/R/MCMC_samplers.R b/packages/nimble/R/MCMC_samplers.R
@@ -3513,7 +3513,7 @@ sampler_barker <- nimbleFunction(
 
 #' MCMC Sampling Algorithms
 #'
-#' Details of the MCMC sampling algorithms provided with the NIMBLE MCMC engine; HMC samplers are in the \code{nimbleHMC} package and particle filter samplers are in the \code{nimbleSMC} package.
+#' Details of the MCMC sampling algorithms provided with the NIMBLE MCMC engine; HMC samplers are in the \code{nimbleHMC} package and particle filter samplers are in the \code{nimbleSMC} package. Additional details, including some recommendations for samplers that may perform better than the samplers that NIMBLE assigns by default are provided in Section 7.11 of the User Manual.
 #'
 #'
 #' @param model (uncompiled) model on which the MCMC is to be run

diff --git a/packages/nimble/R/RCfunction_core.R b/packages/nimble/R/RCfunction_core.R
@@ -263,14 +263,66 @@ nf_checkDSLcode_buildDerivs <- function(code, buildDerivs) {
                 if(isFALSE(buildDerivs) || !length(buildDerivs) || is.null(buildDerivs) ||
                     (is.character(buildDerivs) && !methodName %in% buildDerivs) ||
                    (is.list(buildDerivs) && !methodName %in% names(buildDerivs)))
-                    message("  [Note] Detected use of `nimDerivs` with a function or method, `", methodName, "`, for which `buildDerivs` has not been set. This nimbleFunction cannot be compiled.") 
+                    messageIfVerbose("  [Note] Detected use of `nimDerivs` with a function or method, `", methodName, "`, for which `buildDerivs` has not been set. This nimbleFunction cannot be compiled.") 
             }
 
         }
     }
     invisible(NULL)
 }
 
+nf_checkDSLcode_checkForCalc <- function(code) {
+    code <- body(code)
+    return(sum(all.names(code) == "calculate") != sum(all.vars(code)=="calculate"))
+}
+
+nf_checkDSLcode_checkDerivsOf <- function(code) {
+    code <- body(code)
+    derivsFound <- which(findDerivsCalls(code))
+    if(length(derivsFound)) {
+        derivsOf <- sapply(derivsFound, function(i)
+            return(code[[i]][[3]][[2]][[1]]))
+        return(as.character(derivsOf[sapply(derivsOf, is.name)]))
+    }
+    return(NULL)
+}
+
+findDerivsCalls <- function(code) {
+    ## This assumes `derivs()` call is from assignment like `var <- derivs()`.
+    sapply(code, function(expr)
+        length(expr) >= 3 && length(expr[[1]]) == 1 &&
+        as.character(expr[[1]]) %in% c("=", "<-", "<<-") &&
+        length(expr[[3]]) > 1 && length(expr[[3]][[1]]) == 1 && 
+        as.character(expr[[3]][[1]]) %in% c('derivs', 'nimDerivs'))
+}
+
+checkNestedCalcCall <- function(functionName, methodsWithCalc, methodsDerivsOf) {
+    if(functionName %in% methodsWithCalc) return(TRUE)
+    if(functionName %in% names(methodsDerivsOf)) 
+        return(any(sapply(methodsDerivsOf[[functionName]], checkNestedCalcCall,
+                          methodsWithCalc, methodsDerivsOf)))
+    return(FALSE)
+}
+
+nf_checkDSLcode_calcDerivsArgs <- function(code, methodsWithCalc, methodsDerivsOf) {
+    code <- body(code)
+    derivsFound <- which(findDerivsCalls(code))
+    for(idx in derivsFound) {
+        argNames <- names(code[[idx]][[3]])
+        call <- code[[idx]][[3]][[2]][[1]]
+        if(length(call) == 1 && checkNestedCalcCall(as.character(call), methodsWithCalc, methodsDerivsOf) &&
+           length(setdiff(c('model', 'constantNodes', 'updateNodes'), argNames))) 
+            messageIfVerbose("  [Warning] Detected use of `nimDerivs` on a function or method, `", code[[idx]][[3]][[2]][[1]], "`,\n",
+                             "            that appears to contain the use of `calculate` on a model.\n",
+                             "            If model calculations are done in the method being differentiated, the 'model'\n",
+                             "            argument to 'nimDerivs' should be included to ensure correct restoration of\n",
+                             "            values in the model, and the 'updateNodes' and 'constantNodes' arguments\n",
+                             "            should also be provided (see Section 16.7.2 of the User Manual).")
+    }
+    invisible(NULL)
+}
+
+
 nf_checkDSLcode <- function(code, methodNames, setupVarNames, args, where = NULL) {
     validCalls <- c(names(sizeCalls),
                     otherDSLcalls,

diff --git a/packages/nimble/R/nimbleFunction_Rderivs.R b/packages/nimble/R/nimbleFunction_Rderivs.R
@@ -25,16 +25,25 @@ nimDerivs_dummy <- nimbleFunction(
 #' @param order an integer vector with values within the set \eqn{{0, 1, 2}}, 
 #' corresponding to whether the function value, Jacobian, and Hessian should be
 #'  returned respectively.  Defaults to \code{c(0, 1, 2)}.
-#' @param model (optional) for derivatives of a nimbleFunction that involves model.
-#' calculations, the uncompiled model that is used. This is needed in order
+#' @param model (optional) the uncompiled model that is used, if taking derivatives
+#' of a nimbleFunction that involves model calculations. This is needed in order
 #' to be able to correctly restore values into the model when \code{order} does not
-#' include 0 (or in all cases when double-taping).
+#' include 0 (or in all cases when double-taping). IMPORTANT: if \code{model}
+#' is included, one should also include the arguments \code{updateNodes} and
+#' \code{constantNodes} using the output obtained from running
+#' \code{makeModelDerivsInfo}.
 #' @param ... additional arguments intended for internal use only.
 #'
 #'@details Derivatives for uncompiled nimbleFunctions are calculated using the
 #' \code{numDeriv} package.  If this package is not installed, an error will
 #' be issued.  Derivatives for matrix valued arguments will be returned in 
 #' column-major order.
+#'
+#' As discussed above with the \code{model} argument, if taking derivatives
+#' of a nimbleFunction that involves model calculations (rather than directly
+#' taking derivatives of `calculate`), care needs to be taken to provide
+#' \code{model}, \code{updateNodes}, and \code{calcNodes} arguments. See
+#' Section 16.7.2 of the User Manual for more details.
 #' 
 #' @return an \code{ADNimbleList} with elements \code{value}, \code{jacobian},
 #' and \code{hessian}.
@@ -694,15 +703,7 @@ nimDerivs_nf <- function(call = NA,
           if(e$restoreInfo$deepestDepth < e$restoreInfo$currentDepth)
               e$restoreInfo$deepestDepth <- e$restoreInfo$currentDepth
       }
-  } else { # partial check for whether there is a model in the nimbleFunction
-      if(is(derivFxn, 'refMethodDef') && is.nf(e$.self)) {
-          isModel <- sapply(names(e), function(x) is.model(e[[x]]))
-          if(any(isModel)) {
-              modelElement <- names(e)[which(isModel)]
-              warning("nimDerivs_nf: detected a model, ", paste(modelElement, collapse = ','), ", associated with the nimbleFunction whose method is being differentiated. If model calculations are done in the method being differentiated, the 'model' argument to 'nimDerivs' should be included to ensure correct restoration of values in the model.")
-          }
-      }
-  }
+  } 
 
   ## standardize the derivFxnCall arguments
   derivFxnCall <- match.call(derivFxn, derivFxnCall)