question about estimated variance #45

Open
sgruber65 opened this issue Oct 11, 2021 · 1 comment

@sgruber65

I ran the first example from the R package on CRAN (v1.1.1) 1000 times and recorded the estimates and variances returned each time. The empirical variances of the two treatment-specific means were (0.0037, 0.0040), while the means of the variance estimates (elements [1,1] and [2,2] of the covariance matrix) were (0.0029, 0.0012). The first is perhaps not off by much, but the second is so far off that I'm wondering if there is a bug. Can you please look into this and let me know what you find? Thanks.

--Susan

library(survtmle)

set.seed(1234)
n <- 200
niter <- 1000
est <- matrix(NA, nrow = niter, ncol = 4)
colnames(est) <- c("est.01", "est.11", "var.11", "var.22")
for (i in 1:niter){
  trt <- rbinom(n, 1, 0.5)
  adjustVars <- data.frame(W1 = round(runif(n)), W2 = round(runif(n, 0, 2)))

  ftime <- round(1 + runif(n, 1, 4) - trt + adjustVars$W1 + adjustVars$W2)
  ftype <- round(runif(n, 0, 1))

  # Fit 1
  # fit a survtmle object with glm estimators for treatment, censoring, and
  # failure using the "mean" method
  fit1 <- survtmle(ftime = ftime, ftype = ftype,
                 trt = trt, adjustVars = adjustVars,
                 glm.trt = "W1 + W2",
                 glm.ftime = "trt + W1 + W2",
                 glm.ctime = "trt + W1 + W2",
                 method = "mean", t0 = 6)
  est[i,] <- c(fit1$est[,1], fit1$var[1,1], fit1$var[2,2])
}
colMeans(est)
apply(est[,1:2], 2, var)
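
The comparison described above (Monte Carlo variance of the point estimates versus the average reported variance) can be summarized as a single ratio. The following is only a rough sketch in base R: `variance_calibration` is a hypothetical helper, not part of survtmle, and the toy data are a plain sample mean with variance estimator `var(x) / n`, used only because that estimator is known to be well calibrated. A ratio near 1 suggests the variance estimates match the sampling variability; a ratio well below 1 suggests they are too small.

```r
# Hypothetical helper (not part of survtmle): compare the empirical variance
# of repeated point estimates with the mean of the reported variance estimates.
variance_calibration <- function(estimates, variance_estimates) {
  empirical <- var(estimates)                 # Monte Carlo variance of the estimates
  mean_estimated <- mean(variance_estimates)  # average of the reported variances
  c(empirical = empirical,
    mean_estimated = mean_estimated,
    ratio = mean_estimated / empirical)
}

# Toy illustration (an assumption for demonstration, unrelated to survival data):
# the sample mean of n standard normal draws, with variance estimate var(x) / n.
set.seed(1)
n <- 200
niter <- 1000
toy <- t(replicate(niter, {
  x <- rnorm(n)
  c(est = mean(x), var_est = var(x) / n)
}))

toy_check <- variance_calibration(toy[, "est"], toy[, "var_est"])
print(round(toy_check, 4))
```

With the `est` matrix from the simulation above, the same check would be `variance_calibration(est[, "est.01"], est[, "var.11"])` and `variance_calibration(est[, "est.11"], est[, "var.22"])`. From the numbers reported in this issue, the ratios work out to roughly 0.0029 / 0.0037 ≈ 0.78 and 0.0012 / 0.0040 = 0.30.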
@benkeser
Owner

Thanks for the report. I’ll look into it more closely in the coming week. Off the top of my head, it looks like this is just a poorly thought-out example. It’s not clear that either the sequential outcome regressions or the censoring model should necessarily approximate the truth, and accordingly it’s not clear that the se’s should be correct. In that example I’m essentially just generating some data and running some code. It could probably use a better example. Darn you 7-years-ago David!

But let me look at it a bit more closely and get back to you.
