irt_gen function #40

Sinan-Yavuz · 2021-02-04T22:05:07Z

Hello,

Thank you very much for creating "lsasim".

I have found a bug and want to share it with you.

irt_gen function doesn't produce a 0-1 matrix based on the parameters. I have changed the following to produce a proper dataset.

irt_gen2 <- function (theta, a_par = 1, b_par, c_par = 0, D = 1)
{
  response_pr <- c_par + (1 - c_par) * (exp(a_par*(theta - b_par)) / (1 + exp(a_par*(theta - b_par))))
  y <- rbinom(1, size = 1, prob = response_pr)
  return(y)
}

Let me know if you need me to provide you some examples, why irt_gen doesn't work.

wleoncio · 2021-02-05T06:33:43Z

Hi @Sinan-Yavuz, thank you for your contribution! We're gonna take a look at the issue and implement the necessary changes ASAP.

wleoncio · 2021-02-05T06:36:51Z

In the meantime, it'd be great if you could provide us with some examples why the current implementation of the function doesn't work, so we can add them to our unit test battery.

Sinan-Yavuz · 2021-02-05T14:49:38Z

Hi @wleoncio, thank you very much for your quick reply.

Here is the example;


response_gen2 <- function (subject, item, theta, a_par = NULL, b_par, c_par = NULL, 
                           d_par = NULL, item_no = NULL, ogive = "Logistic") 
{
  if (length(subject) != length(item)) 
    stop("subject and item vectors must be equal length.", 
         call. = FALSE)
  if (is.null(a_par)) 
    a_par <- rep(1, length(unique(item)))
  if (is.null(c_par)) 
    c_par <- rep(0, length(unique(item)))
  if (ogive == "Logistic") 
    DD <- 1
  if (ogive == "Normal") 
    DD <- 1.7
  if (is.null(item_no)) 
    item_no <- seq(length(unique(item)))
  if (is.null(d_par)) 
    b_pars <- split(b_par, seq(length(b_par)))
  if (!is.null(d_par)) {
    d_mat <- do.call("cbind", d_par)
    b_pars <- list()
    for (i in 1:length(b_par)) {
      if (sum(abs(d_mat[i, ])) != 0) {
        b_list <- list()
        for (j in 1:length(d_mat[i, ])) b_list[[j]] <- b_par[i] + 
            d_mat[i, j]
        b_pars[[i]] <- unlist(b_list)
      }
      if (sum(abs(d_mat[i, ])) == 0) 
        b_pars[[i]] <- b_par[i]
    }
  }
  names(b_pars) <- item_no
  
  y <- numeric(length(subject))
  for (n in 1:length(subject)) {
    y[n] <- irt_gen2(theta = theta[subject[n]], a_par = a_par[which(item_no == 
                                                                      item[n])], b_par = b_pars[[which(item_no == item[n])]], 
                     c_par = c_par[which(item_no == item[n])], D = DD)
  }
  df_l <- data.frame(item = item, subject = subject, response = y)
  df_w <- reshape(df_l, timevar = "item", idvar = "subject", 
                  direction = "wide")
  df_item_old <- colnames(df_w)[2:length(df_w)]
  df_item_num <- gsub("[^[:digit:]]", "", df_item_old)
  df_item_new <- ifelse(nchar(df_item_num) == 1, paste0("i00", 
                                                        df_item_num), ifelse(nchar(df_item_num) == 2, paste0("i0", 
                                                                                                             df_item_num), paste0("i", df_item_num)))
  colnames(df_w)[2:length(df_w)] <- df_item_new
  df_w <- df_w[, order(names(df_w))]
  rownames(df_w) <- NULL
  return(y = df_w)
}



irt_gen2 <- function (theta, a_par = 1, b_par, c_par = 0, D = 1)
{
  response_pr <- c_par + (1 - c_par) * (exp(a_par*(theta - b_par)) / (1 + exp(a_par*(theta - b_par))))
  y <- rbinom(1, size = 1, prob = response_pr)
  return(y)
}


set.seed(345)

#this is a normal example
subject <- 1:100
theta <- rnorm(100,0,1)
student_dt <- data.frame(subject, theta)


item <- as.integer(c(1:n_items))
a <- runif(n_items, 0.5, 1.5)
b <- runif(n_items, -3, 3)
c <- runif(n_items, 0, .15)
item_par <- data.frame(item, a, b, c)

resp_matrix <- response_gen(subject = sort(rep(subject, 30)), item = rep(item,100), theta = theta, a_par = a, b_par = b, c_par = c)

#item means make sense and everything looks fine
colMeans(resp_matrix[item])

lm(colMeans(resp_matrix[item]) ~ b)


#Let's make it very extreme, student mean theta is -20, I don't expect them to solve anything. But guessing parameters are super high. So, regardles if their theta 80%-90% of them should solve these items.
subject <- 1:100
theta <- rnorm(100, -20 ,1)
student_dt <- data.frame(subject, theta)


item <- as.integer(c(1:n_items))
a <- runif(n_items, 0.5, 1.5)
b <- runif(n_items, -3, 3)

#chance is super high
c <- runif(n_items, 0.8, .9)
item_par <- data.frame(item, a, b, c)

resp_matrix <- response_gen(subject = sort(rep(subject, 30)), item = rep(item,100), theta = theta, a_par = a, b_par = b, c_par = c)

colMeans(resp_matrix[item]) #look at the results, they are around 50%, what happened to .8-.9 guessing parameter
lm(colMeans(resp_matrix[item]) ~ b)


#let's check after the fix - but it only works for dichotomous variables
resp_matrix2 <- response_gen2(subject = sort(rep(subject, 30)), item = rep(item,100), theta = theta, a_par = a, b_par = b, c_par = c)
colMeans(resp_matrix2[item]) #look at the results, they are around 50%, what happened to .8-.9 guessing parameter
lm(colMeans(resp_matrix2[item]) ~ b)

pdbailey0 · 2021-02-05T18:33:57Z

Hi, I worked to identify this bug with @Sinan-Yavuz and the way you can tell this is a bug is that the intercept of the lm should move up as the guessing parameter increases.

Your formula works so long as the c parameter is 0. When it is not zero you make the wrong probability of a zero

The problem is that the binary is being treated like the polytomous case. In the binary case there is a guessing parameter and you do not want to divide by the denominator, the second element of the numerator is already the probability of a 1. c_par decreases, not increases the probability of a zero, but your formula increases it.

mollyolaf · 2021-02-05T19:38:41Z

Thanks, Paul and Sinan - that's super helpful. We'll sort this out. Glad you are using lsasim!

wleoncio · 2021-02-08T09:01:04Z

Checking the MRE

OK, so I've created a markdown document based on the code above. Please check it here and let me know if the output conforms to your expectations.

Fixing the bug

As for implementing the changes, I am not sure how to proceed. The irt_gen2() function proposed in the OP doesn't break any of the current unit tests, but I understand it only works for dichotomous variables, so I am not sure if it can simply replace irt_gen() and call it a day. I'd also like to understand a bit better the differences between lsasim::response_gen() and the response_gen2() function from your MRE.

pdbailey0 · 2021-02-08T12:59:59Z

I think you may want to break up dichotomous from polytomous cases. I've never seen a polytomous case with a c parameter.

I think you could add this line right before you sample

  response_pr[1] <- ifelse(c_par > 0, 1 - response_pr[2], response_pr[1])

so irt_get would be:

irt_gen <- function(theta, a_par = 1, b_par, c_par = 0, D = 1) {
  unsummed <- c(0, D * a_par * (theta - b_par))
  num <- exp(cumsum(unsummed))
  den <- sum(num)
  response_pr <- c_par + (1 - c_par) * (num / den)
  response_pr[1] <- ifelse(c_par > 0, 1 - response_pr[2], response_pr[1])
  y <- sample(1:length(response_pr) - 1, size = 1, prob = response_pr)
  return(y)
}

but I haven't tested this and I'm still not entirely sure what is vectorized here.

Another workaround would be to use Sinan's code when c_par > 0 and your existing code otherwise.

wleoncio · 2021-02-09T06:32:13Z

Good ideas, @pdbailey0, we'll work on that ASAP. 👍🏽

Source: #40 (comment)

wleoncio · 2021-06-22T13:42:37Z

Changes implemented and will be included on the next CRAN release of the package. @Sinan-Yavuz and @pdbailey0, thank you once again for your contribution; we have added you as package contributors, please let us know if the information below needs any modification (e.g. typo or addition of a field such as e-mail):

lsasim/DESCRIPTION

Lines 17 to 18 in 97934b4

    
           person("Sinan", "Yavuz", role = "ctb"), 
        
           person("Paul", "Bailey", role = "ctb")

Sinan-Yavuz · 2021-06-24T13:24:28Z

@wleoncio, I am honored to be mentioned as a contributor though it is not necessary. I will make a modification to my name and email. At some point, if you change your mind and think my contribution is no longer valid or important, feel free to remove me.

wleoncio · 2021-06-25T06:25:24Z

Looking forward to the Pull Request containing the modifications, @Sinan-Yavuz!

Your and Paul's mentions were based on the documentation for the person() function (see extract below) and your authorship of the code on commits 5f3396f and dfb1450.

‘"ctb"’ (Contributor) Use for authors who have made smaller
contributions (such as code patches etc.) but should not show
up in the package citation.

By "package citation", they mean the contents of citation("lsasim").

wleoncio added the bug label Feb 5, 2021

wleoncio self-assigned this Feb 8, 2021

wleoncio mentioned this issue May 3, 2021

Generating multinomial background variables #41

Open

wleoncio pushed a commit that referenced this issue Jun 22, 2021

Added MRE about issue #40

5f3396f

Source: #40 (comment)

wleoncio pushed a commit that referenced this issue Jun 22, 2021

Added Paul's fix to #40

dfb1450

wleoncio added a commit that referenced this issue Jun 22, 2021

Properly-formatted #40 unit tests

48a0bd9

wleoncio added a commit that referenced this issue Jun 22, 2021

Added Sinan and Paul as contributors (closes #40)

97934b4

wleoncio closed this as completed Jun 22, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

irt_gen function #40

irt_gen function #40

Sinan-Yavuz commented Feb 4, 2021 •

edited by wleoncio

Loading

wleoncio commented Feb 5, 2021 •

edited

Loading

wleoncio commented Feb 5, 2021

Sinan-Yavuz commented Feb 5, 2021 •

edited

Loading

pdbailey0 commented Feb 5, 2021

mollyolaf commented Feb 5, 2021

wleoncio commented Feb 8, 2021

pdbailey0 commented Feb 8, 2021

wleoncio commented Feb 9, 2021

wleoncio commented Jun 22, 2021

Sinan-Yavuz commented Jun 24, 2021

wleoncio commented Jun 25, 2021

irt_gen function #40

irt_gen function #40

Comments

Sinan-Yavuz commented Feb 4, 2021 • edited by wleoncio Loading

wleoncio commented Feb 5, 2021 • edited Loading

wleoncio commented Feb 5, 2021

Sinan-Yavuz commented Feb 5, 2021 • edited Loading

pdbailey0 commented Feb 5, 2021

mollyolaf commented Feb 5, 2021

wleoncio commented Feb 8, 2021

Checking the MRE

Fixing the bug

pdbailey0 commented Feb 8, 2021

wleoncio commented Feb 9, 2021

wleoncio commented Jun 22, 2021

Sinan-Yavuz commented Jun 24, 2021

wleoncio commented Jun 25, 2021

Sinan-Yavuz commented Feb 4, 2021 •

edited by wleoncio

Loading

wleoncio commented Feb 5, 2021 •

edited

Loading

Sinan-Yavuz commented Feb 5, 2021 •

edited

Loading