2.6.0

Wenchao-Ma · Jun 12, 2019 · 93e58bc · 93e58bc
1 parent 07b60a8
commit 93e58bc
Show file tree

Hide file tree

Showing 9 changed files with 117 additions and 65 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: GDINA
 Type: Package
 Title: The Generalized DINA Model Framework
-Version: 2.5.2
-Date: 2019-5-10
+Version: 2.6.0
+Date: 2019-6-12
 Authors@R: c(person(given = "Wenchao",family = "Ma", role = c("aut", "cre", "cph"),email = "wenchao.ma@ua.edu"),person(given = "Jimmy", family = "de la Torre", role = c("aut","cph")), person(given="Miguel",family = "Sorrel",role = "ctb"), person(given="Zhehan",family = "Jiang",role = "ctb"))
 Description: A set of psychometric tools for cognitive diagnosis modeling based on the generalized deterministic inputs, noisy and gate (G-DINA) model by de la Torre (2011) <DOI:10.1007/s11336-011-9207-7> and its extensions, including the sequential G-DINA model by Ma and de la Torre (2016) <DOI:10.1111/bmsp.12070> for polytomous responses, and the polytomous G-DINA model by Chen and de la Torre <DOI:10.1177/0146621613479818> for polytomous attributes. Joint attribute distribution can be independent, saturated, higher-order, loglinear smoothed or structured. Q-matrix validation, item and model fit statistics, model comparison at test and item level and differential item functioning can also be conducted. A graphical user interface is also provided.
 License: GPL-3
@@ -62,7 +62,7 @@ Collate:
     'utils.R'
     'zzz.R'
 NeedsCompilation: yes
-Packaged: 2019-5-10 02:03:59 UTC; Wenchao
+Packaged: 2019-6-12 02:03:59 UTC; Wenchao
 Author: Wenchao Ma [aut, cre, cph], Jimmy de la Torre [aut, cph], Miguel Sorrel [ctb], Zhehan Jiang [ctb]
 Maintainer: Wenchao Ma <wenchao.ma@ua.edu>
 VignetteBuilder: knitr
diff --git a/NEWS.md b/NEWS.md
@@ -1,6 +1,10 @@
+# GDINA 2.6.0
+* Fixed   - a mistake in BIC calculation (since version 2.5.2)  
+* Fixed   - M2 calculation in `modelfit` was incorrect for LLM and RRUM
+* Changed - M2 in `modelfit` can be calculated for higher-order or independent models of joint attribute distribution
+
 # GDINA 2.5.2
 * Added   - `ICLA()` function for attribute profile estimation
-* Changed - data are grouped before single group GDINA analysis 
 * Changed - data are aggregated before model estimation for single group CDMs
 * Deleted - exported internal functions are removed
 * Fixed   - `print.GDINA`()` prints valid number of individuals by default now

diff --git a/R/M2.R b/R/M2.R
@@ -1,25 +1,26 @@
 #' Model fit statistics
 #'
-#' Calculate various model-data fit statistics
+#' Calculate various absolute model-data fit statistics
 #'
-#' Various model-data fit statistics including M2 statistic for G-DINA model with dichotmous responses (Liu, Tian, & Xin, 2016; Hansen, Cai, Monroe, & Li, 2016) and for sequential G-DINA model with graded responses (Ma, under review).
+#' Various model-data fit statistics including M2 statistic for G-DINA model with dichotmous responses (Liu, Tian, & Xin, 2016; Hansen, Cai, Monroe, & Li, 2016) and for sequential G-DINA model with graded responses (Ma, 2019).
 #' It also calculates SRMSR and RMSEA2.
 #'
 #' @param GDINA.obj An estimated model object of class \code{GDINA}
 #' @param CI numeric value from 0 to 1 indicating the range of the confidence interval for RMSEA. Default returns the 90\% interval.
-#' @param ... arguments passed to the function
+#' @param ItemOnly should joint attribute distribution parameters be considered? Default = TRUE. See Ma (2019).
 #'
 #' @author {Wenchao Ma, The University of Alabama, \email{wenchao.ma@@ua.edu}}
 #' @export
 #' @references
 #'
+#' Hansen, M., Cai, L.,  Monroe, S., & Li, Z. (2016). Limited-information goodness-of-fit testing of diagnostic classification item response models. \emph{British Journal of Mathematical and Statistical Psychology. 69,} 225--252.
+#'
+#' Liu, Y., Tian, W., & Xin, T. (2016). An Application of M2 Statistic to Evaluate the Fit of Cognitive Diagnostic Models. \emph{Journal of Educational and Behavioral Statistics, 41}, 3-26.
+#'
 #' Ma, W. (2019). Evaluating model data fit using limited information statistics for the sequential G-DINA model.\emph{Applied Psychological Measurement.}
 #'
 #' Maydeu-Olivares, A. (2013). Goodness-of-Fit Assessment of Item Response Theory Models. \emph{Measurement, 11}, 71-101.
 #'
-#' Hansen, M., Cai, L.,  Monroe, S., & Li, Z. (2016). Limited-information goodness-of-fit testing of diagnostic classification item response models. \emph{British Journal of Mathematical and Statistical Psychology. 69,} 225--252.
-#'
-#' Liu, Y., Tian, W., & Xin, T. (2016). An Application of M2 Statistic to Evaluate the Fit of Cognitive Diagnostic Models. \emph{Journal of Educational and Behavioral Statistics, 41}, 3-26.
 #' @examples
 #' \dontrun{
 #' dat <- sim10GDINA$simdat
@@ -29,24 +30,15 @@
 #'}
 
 
-modelfit <- function(GDINA.obj,CI = 0.90,...)
+modelfit <- function(GDINA.obj, CI = 0.90, ItemOnly = FALSE)
 {
 
   if(CI>=1||CI<=0) stop("CI must be between 0 and 1.",call. = FALSE)
 
-  ItemOnly <- dots("ItemOnly",FALSE,...)
-
   if(extract(GDINA.obj, "ngroup")!=1) {
     stop("modelfit is only applicable to single group analysis.", call. = FALSE)
   }
-  if(!is.null(extract(GDINA.obj,"att.str")))
-    stop("model-fit evaluation is not available for structured attributes.",call. = FALSE)
-  if (any(extract(GDINA.obj, "models_numeric") < 0) ||
-      any(extract(GDINA.obj, "models_numeric") > 6))
-    stop("modelfit is only applicable to GDINA, DINA, DINO, ACDM, LLM and RRUM.",
-         call. = FALSE)
-  if (extract(GDINA.obj, "att.dist") %in% c("higher.order","independent","fixed"))
-    stop(paste("modelfit is not available for ",extract(GDINA.obj, "att.dist"),"joint attribute distribution."),call. = FALSE)
+
 
   delta <- extract(GDINA.obj, "delta.parm")
   Q <- extract(GDINA.obj, "Q")
@@ -61,7 +53,8 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
   nitems <- ncol(dat)   # number of items
   ncat <- extract(GDINA.obj, "ncat")
   K <- extract(GDINA.obj, "natt")
-  L <- 2^K
+  att <- as.matrix(extract(GDINA.obj,"attributepattern"))
+  L <- extract(GDINA.obj, "nLC")
   Kj <- extract(GDINA.obj, "Kj")
   nparJ <- npar(GDINA.obj)$`No. of total item parameters`
   models <- extract(GDINA.obj, "models")
@@ -70,23 +63,19 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
   pf <- extract(GDINA.obj, "LCpf.parm") # S(Xi=s|alpha) S x L without category 0
 
   # Observed p
-  crossp <-
-    crossprod.na(dat, dat, val = 0) / crossprod(!is.na(dat), !is.na(dat))
+  crossp <- crossprod.na(dat, dat, val = 0) / crossprod(!is.na(dat), !is.na(dat))
   # univariate and bivariate observed
-  p <-
-    c(colMeans(dat, na.rm = TRUE), crossp[lower.tri(crossp)]) # length of nitems + nitems*(nitems-1)/2
+  # length of nitems + nitems*(nitems-1)/2
+  p <- c(colMeans(dat, na.rm = TRUE), crossp[lower.tri(crossp)])
 
   Xi <- Mord(item.no, as.matrix(pj), post)
   Xi2 <- cbind(rbind(Xi$Xi11, Xi$Xi21), rbind(t(Xi$Xi21), Xi$Xi22))
 
-  e <-
-    c(Xi$uni, Xi$bi[lower.tri(Xi$bi)])   # length of nitems + nitems*(nitems-1)/2
+  e <- c(Xi$uni, Xi$bi[lower.tri(Xi$bi)])   # length of nitems + nitems*(nitems-1)/2
   se <- sqrt(diag(Xi$bi) - c(Xi$uni) ^ 2)
-  difr <-
-    cor(dat, use = "pairwise.complete.obs") - (Xi$bi - Xi$uni %*% t(Xi$uni)) /
+  difr <- cor(dat, use = "pairwise.complete.obs") - (Xi$bi - Xi$uni %*% t(Xi$uni)) /
     (se %*% t(se))
-  SRMSR <-
-    sqrt(sum((difr[lower.tri(difr)]) ^ 2 / (nitems * (nitems - 1) / 2)))
+  SRMSR <- sqrt(sum((difr[lower.tri(difr)]) ^ 2 / (nitems * (nitems - 1) / 2)))
   M2 <- sig <- df <- rmsea <- ci <- NULL
   if (ItemOnly &
       nitems * (nitems + 1) / 2 - extract(GDINA.obj, "npar.item") < 0) {
@@ -98,11 +87,10 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
             call. = FALSE)
   } else{
     # parameter locations
-    patt <- eta(as.matrix(Q))
-    Mj <- list()
-    for (s in 1:ncat)
-      Mj[[s]] <- designmatrix(Kj[s], models[s])
 
+    patt <- extract(GDINA.obj,"eta")
+    Mj <- extract(GDINA.obj,"designmatrix")
+    linkf <- extract(GDINA.obj,"linkfunc")
     parloc <- matrix(c(cumsum(c(
       1, unlist(lapply(Mj, ncol))[-ncat]
     )),
@@ -112,16 +100,17 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
     # partial s/ partial d
     extMj <- vector("list", ncat)
     for (j in 1:ncat) {
-      extMj[[j]] <- Mj[[j]][patt[j, ],]
-      if (models[j] == "LLM") {
-        extMj[[j]] <- pj[j, ] * (1 - pj[j, ]) * extMj[[j]]
-      } else if (models[j] == "RRUM") {
-        extMj[[j]] <- pj[j, ] * extMj[[j]]
+      if(linkf[j]=="identity"){
+        extMj[[j]] <- Mj[[j]][patt[j, ],]
+      }else if (linkf[j]=="logit") {
+        extMj[[j]] <- pj[j, ] * (1 - pj[j, ]) * Mj[[j]][patt[j, ],]
+      } else if (linkf[j]=="log") {
+        extMj[[j]] <- pj[j, ] * Mj[[j]][patt[j, ],]
       }
 
     }
     # seq_component is partial E/partial s
-    seq_component <- matrix(0, nrow(Qc), L)
+    seq_component <- matrix(0, ncat, L)
     expected.score <- matrix(0, nitems, L)
     for (j in 1:nitems) {
       locj <- which(item.no == j)
@@ -147,8 +136,6 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
 
     delta22E <- matrix(0, nitems * (nitems - 1) / 2, L)
     loc <- 1
-    pcpl <- rbind(diag(L - 1), -1)
-
     for (j in 1:nitems) {
       locj <- which(item.no == j)
       for (sj in locj) {
@@ -173,19 +160,77 @@ modelfit <- function(GDINA.obj,CI = 0.90,...)
       }
     }
 
-    if (ItemOnly) {
+    if (ItemOnly || extract(GDINA.obj, "att.dist")=="fixed") {
       delt <- rbind(delta11, delta21)
     } else{
       if (extract(GDINA.obj, "att.dist") == "saturated") {
-        delta12 <- expected.score %*% rbind(diag(L - 1), -1)
-        delta22 <- delta22E %*% rbind(diag(L - 1), -1)
-      } else if (extract(GDINA.obj, "att.dist") == "loglinear") {
+        pcpl <- rbind(diag(L - 1), -1)
+        delta12 <- expected.score %*% pcpl
+        delta22 <- delta22E %*% pcpl
+      } else if (extract(GDINA.obj, "att.dist") == "loglinear"){
         Z <- designM(K, 0)
         loglinear <- extract(GDINA.obj, "loglinear")
         Z <- Z[, seq_len(1 + sum(sapply(seq_len(extract(GDINA.obj,"loglinear")),choose,n=K)))]
 
         delta12 <- expected.score %*% (post * Z)
         delta22 <- delta22E %*% (post * Z)
+      }else if(extract(GDINA.obj, "att.dist") == "higher.order"){
+
+        higher.order <- extract(GDINA.obj,"higher.order")
+        HOpar <- extract(GDINA.obj,"struc.parm")
+
+        P.att.theta <- exp(logLikPattern(att,
+                      higher.order$QuadNodes,
+                      HOpar[,1],HOpar[,2]) +
+              matrix(1,nrow(att),1)%*%t(log(higher.order$QuadWghts))) # 2^K x nnodes
+
+        Pk_theta <- Pr_2PL_vec(higher.order$QuadNodes, HOpar[,1],HOpar[,2]) #P(\alpha_k|theta) nnodes x K
+        if(higher.order$model=="Rasch"){
+          delta12 <- matrix(0,nitems,K)
+          delta22 <- matrix(0,nrow(delta22E),K)
+          for(k in seq_len(nrow(HOpar))){
+
+            # partial E/partial intercept
+            delta12[,k] <- rowSums(expected.score%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+            delta22[,k] <- rowSums(delta22E%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+
+          }
+        }else if(higher.order$model=="2PL"){
+          delta12 <- matrix(0,nitems,2*K)
+          delta22 <- matrix(0,nrow(delta22E),2*K)
+          for(k in seq_len(nrow(HOpar))){
+
+            # partial E/partial intercept
+            delta12[,k] <- rowSums(expected.score%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+            delta22[,k] <- rowSums(delta22E%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+            # partial E/partial slope <= 2PL
+            delta12[,k+K] <- c(expected.score%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-"))%*%higher.order$QuadNodes)
+            delta22[,k+K] <- c(delta22E%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-"))%*%higher.order$QuadNodes)
+
+          }
+        }else if(higher.order$model=="1PL"){
+          delta12 <- matrix(0,nitems,1+K)
+          delta22 <- matrix(0,nrow(delta22E),1+K)
+          for(k in seq_len(nrow(HOpar))){
+
+            # partial E/partial intercept
+            delta12[,k] <- rowSums(expected.score%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+            delta22[,k] <- rowSums(delta22E%*%(P.att.theta * outer(att[,k],Pk_theta[,k],"-")))
+
+          }
+          # partial E/partial slope <= 2PL
+          delta12[,1+K] <- c(expected.score%*%(P.att.theta * outer(rowSums(att),rowSums(Pk_theta),"-"))%*%higher.order$QuadNodes)
+          delta22[,1+K] <- c(delta22E%*%(P.att.theta * outer(rowSums(att),rowSums(Pk_theta),"-"))%*%higher.order$QuadNodes)
+
+
+        }
+      }else if(extract(GDINA.obj, "att.dist") == "independent"){
+        pr <- extract(GDINA.obj,"struc.parm")
+        pr[pr < .Machine$double.eps] <- .Machine$double.eps
+        pr[pr > 1 - .Machine$double.eps] <- 1 - .Machine$double.eps
+        mp <- matrix(pr,nrow(att),ncol(att),byrow = TRUE)
+        delta12 <- expected.score %*% diag(post) %*% ( (att - mp) / (mp * (1-mp)) )
+        delta22 <- delta22E %*% diag(post) %*% ( (att - mp) / (mp * (1-mp)) )
       }
       delt <- cbind(rbind(delta11, delta21), rbind(delta12, delta22))
     }

diff --git a/R/SingleGroup_Estimation.R b/R/SingleGroup_Estimation.R
@@ -511,7 +511,7 @@ SG.Est <- function(dat, Q, weight=NULL, model, sequential,att.dist, att.prior, s
   list(catprob.parm = item.prob, delta.parm = delta, catprob.matrix = item.parm,
        struc.parm = lambda, model = model2character(model), LC.prob = LC.Prob,
        posterior.prob = postP, pf = pf, attributepattern = AlphaPattern,
-       testfit = list(Deviance=neg2LL,npar = npar,item.npar = free.item.npar, AIC=2 * npar + neg2LL, BIC=neg2LL + npar * log(N)),
+       testfit = list(Deviance=neg2LL,npar = npar,item.npar = free.item.npar, AIC=2 * npar + neg2LL, BIC=neg2LL + npar * log(length(raw2unique))),
        technicals = list(logposterior.i = estep$logpost[raw2unique, ], loglikelihood.i = estep$loglik[raw2unique, ],
                          free.item.npar = free.item.npar,
                          total.item.npar = total.item.npar, stru.npar = stru.npar, total.npar = npar,

diff --git a/R/extract.R b/R/extract.R
@@ -323,6 +323,7 @@ extract.GDINA <- function(object,what,SE.type = 2,...){
                 nitem = ncol(object$options$dat),
                 nitr = object$options$itr,
                 nobs = nrow(object$options$dat),
+                nLC = dim(object$attributepattern)[1],
                 pf = object$pf,
                 posterior.prob = object$posterior.prob,
                 prevalence = {

diff --git a/README.Rmd b/README.Rmd
@@ -28,7 +28,7 @@ Visit the package website https://wenchao-ma.github.io/GDINA for examples, tutor
 
 ## Learning resources
 
-* Watch [Ma and de la Torre's (2019) NCME digital module 5](https://ncme.elevate.commpartners.com/) on a gental introduction to the G-DINA model framework and the use of graphical user interface for CDM analyses
+* Watch [Ma and de la Torre's (2019) NCME digital module 5](https://ncme.elevate.commpartners.com/) on a gentle introduction to the G-DINA model framework and the use of graphical user interface for CDM analyses
 
 * Check [de la Torre and Akbay's (2019) article](https://doi.org/10.14689/ejer.2019.80.9) on how to conduct various CDM analyses using the graphical user interface
 
@@ -54,6 +54,8 @@ Visit the package website https://wenchao-ma.github.io/GDINA for examples, tutor
 
 ## Installation
 
+The stable version of GDINA should be installed from R CRAN at [here](https://CRAN.R-project.org/package=GDINA)
+
 To install this package from source:
 
 1) Windows users may need to install the
@@ -70,7 +72,7 @@ To install this package from source:
 devtools::install_github("Wenchao-Ma/GDINA")
 ```
 
-The stable version of GDINA should be installed from R CRAN at [here](https://CRAN.R-project.org/package=GDINA)
+
 
 
 
diff --git a/README.md b/README.md
@@ -24,7 +24,7 @@ examples, tutorials and more information.
 ## Learning resources
 
   - Watch [Ma and de la Torre’s (2019) NCME digital
-    module 5](https://ncme.elevate.commpartners.com/) on a gental
+    module 5](https://ncme.elevate.commpartners.com/) on a gentle
     introduction to the G-DINA model framework and the use of graphical
     user interface for CDM analyses
 
@@ -62,6 +62,9 @@ examples, tutorials and more information.
 
 ## Installation
 
+The stable version of GDINA should be installed from R CRAN at
+[here](https://CRAN.R-project.org/package=GDINA)
+
 To install this package from source:
 
 1)  Windows users may need to install the
@@ -83,6 +86,3 @@ To install this package from source:
 # install.packages("devtools")
 devtools::install_github("Wenchao-Ma/GDINA")
 ```
-
-The stable version of GDINA should be installed from R CRAN at
-[here](https://CRAN.R-project.org/package=GDINA)
diff --git a/inst/CITATION b/inst/CITATION
@@ -4,12 +4,12 @@ citEntry(entry = "Manual",
   title        = "{GDINA: The generalized DINA model framework}",
   author       = as.person("Ma, W. and de la Torre, J."),
   year         = "2019",
-  note         = "R package version 2.5",
+  note         = "R package version 2.6",
   url          = "https://CRAN.R-project.org/package=GDINA",
 
   textVersion  =
   paste("Ma, W. & de la Torre, J. (2019).",
         "GDINA: The generalized DINA model framework.",
-        "R package version 2.5. Retrived from https://CRAN.R-project.org/package=GDINA")
+        "R package version 2.6. Retrived from https://CRAN.R-project.org/package=GDINA")
 )
 
diff --git a/man/modelfit.Rd b/man/modelfit.Rd