Bug in plotting functions plot.CBPS and plot.CBPSContinuous when using "covars" argument #21

Open
erp31 opened this issue Mar 17, 2021 · 5 comments


@erp31

erp31 commented Mar 17, 2021

Hi,

Firstly, thanks for your work on this package.

I have found some minor bugs in the plotting functionality:

  1. plot.CBPS fails unless "covars" has the form 1:x, where x is a number less than the number of covariates; e.g., 3:4 or c(1, 2, 4) does not work.

I think the reason is that these lines subset by "covars":
balanced.std.mean <- bal.x[["balanced"]][covars, ]
original.std.mean <- bal.x[["original"]][covars, ]
but later (in the for loop) the code again selects the rows "covars" from balanced.std.mean and original.std.mean, which have already been subsetted, and that causes an error. The rows should be selected by "covars" only once.

  2. plot.CBPSContinuous fails when "silent = FALSE" is used in combination with "covars".
    The bug here is in the code that returns the output. I think it should be
    data.frame(covariate = rownames(bal.x[["balanced"]])[covars], balanced = balanced.abs.cor, original = original.abs.cor)
    instead of
    data.frame(covariate = rownames(bal.x[["balanced"]]), balanced = balanced.abs.cor, original = original.abs.cor)
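Both failure modes described above can be reproduced without the package. The following is only a minimal sketch on a made-up matrix standing in for bal.x[["balanced"]]; the object names (bal, once, twice, vals, fixed) and the values are assumptions for illustration, not code taken from CBPS.

```r
# Toy stand-in for bal.x[["balanced"]]: 4 covariates (rows), 2 columns.
# All names and values here are made up for illustration.
bal <- matrix(seq(0.1, 0.8, by = 0.1), nrow = 4,
              dimnames = list(paste0("x", 1:4), c("treated", "control")))
covars <- c(1, 2, 4)

# Issue 1: subsetting by covars twice.
once <- bal[covars, , drop = FALSE]            # 3 rows: x1, x2, x4
twice <- tryCatch(once[covars, , drop = FALSE],
                  error = function(e) "error") # row 4 of a 3-row matrix does not exist

# A prefix such as 1:2 survives a second subset, which is consistent
# with only covars of the form 1:x appearing to work.
prefix_ok <- bal[1:2, , drop = FALSE][1:2, , drop = FALSE]

# Issue 2: all row names paired with values for the selected rows only.
vals <- unname(once[, "treated"])              # length 3
mismatch <- tryCatch(data.frame(covariate = rownames(bal), balanced = vals),
                     error = function(e) "error") # 4 names vs 3 values

# Indexing the row names by covars as well keeps the lengths equal.
fixed <- data.frame(covariate = rownames(bal)[covars], balanced = vals)
```

Note that data.frame recycles a shorter vector when its length divides the longer one evenly, so a selection of 2 out of 4 covariates would not raise an error in the second case but would silently pair names with the wrong values.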

I also noticed that the documentation for CBMSM doesn't state that the data need to be sorted by "time" rather than "id". My data were sorted by "id" and it took me a long time to work out why I was getting weight estimates of NA, so it would be really helpful if this could be added to the documentation.

I am using version 0.21 of CBPS on R 3.6.3.

Thanks!

@kosukeimai
Owner

Thanks for this. Would you mind making a pull request? Also, the cobalt package works nicely with CBPS: see https://cran.r-project.org/web/packages/cobalt/vignettes/cobalt_A1_other_packages.html#using-bal.tab-with-cbps

@christianfong
Collaborator

Thank you for flagging these. I've pushed an update that fixed the plotting issues and will get that on CRAN later this week. Can you please post code so that I can replicate the CBMSM bug you've run into? I didn't code that one myself, so I want to make sure I understand what's going on so that I can fix the documentation.

@erp31
Author

erp31 commented Mar 22, 2021

Hi Christian,

Thanks for fixing the plotting issues faster than I could get round to making a pull request!

In the code below I've used the example for CBMSM then sorted the data by "id" and refit the model. If you inspect the "weights" element of the output you'll see that they're different, with the second having many NAs.

# Example from CBMSM docs
library(CBPS)
data(Blackwell)

form0 <- "d.gone.neg ~ d.gone.neg.l1 + camp.length"
fit0 <- CBMSM(formula = form0, time = Blackwell$time, id = Blackwell$demName,
            data = Blackwell, type = "MSM",  iterations = NULL, twostep = TRUE, 
            msm.variance = "approx", time.vary = FALSE)

any(is.na(fit0$weights)) # this is FALSE as expected

# Now reorder the Blackwell data by id column instead of time
Blackwell2 <- Blackwell[order(Blackwell$demName), ]
fit1 <- CBMSM(formula = form0, time = Blackwell2$time, id = Blackwell2$demName,
            data = Blackwell2, type = "MSM",  iterations = NULL, twostep = TRUE, 
            msm.variance = "approx", time.vary = FALSE)

any(is.na(fit1$weights)) # this is TRUE

@christianfong
Collaborator

Thanks, I've replicated your error, and that seems like bad behavior to me. I will consider a documentation fix as a last resort, but I'm trying to figure out what's causing it. Will probably take a little while, since rtools is also giving me crap after the 4.0 update, but I will do my best to figure this out.

@christianfong
Collaborator

I wasn't able to figure out what in CBMSM is causing the function to rely on observations being sorted by time, so I have simply added to the documentation that the function expects data to be sorted by time, as you have suggested. Thank you so much for flagging these issues for us.
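For anyone hitting the same NA weights, one way to guard against it is to check the row order and re-sort by time before fitting. This is a rough sketch on a toy panel; sorted_by_time is a hypothetical helper, not a CBPS function, and breaking ties by id within each time point is an assumption rather than a documented requirement.

```r
# Hypothetical helper (not part of CBPS): TRUE if the time column is
# in non-decreasing order.
sorted_by_time <- function(time) !is.unsorted(time)

# Toy panel: 2 units observed at 3 time points, stored sorted by id.
panel <- data.frame(id = rep(c("a", "b"), each = 3),
                    time = rep(1:3, times = 2))
sorted_by_time(panel$time)         # FALSE

# Reorder by time; breaking ties by id is an assumption.
panel_sorted <- panel[order(panel$time, panel$id), ]
sorted_by_time(panel_sorted$time)  # TRUE
```

Applied to the example in this thread, the same reordering would be Blackwell2[order(Blackwell2$time, Blackwell2$demName), ] before calling CBMSM, with the time and id arguments taken from the reordered data frame.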

3 participants