Puzzling rbind() behaviour #98
Comments
Hi Stefano, Thank you so much for pointing out this bug. It was easy to fix, and I have fixed it in the development version of the package. I'll be uploading it to CRAN shortly. I really appreciate you letting me know about this! Noah
Hi Noah, Sincere thanks for quickly confirming and fixing the rbind.matchdata() bug I wrote to you about. Unfortunately, given the sensitive nature of the (individual-level health-care) data I work with, I won't have access to your upgraded package, as the R environment I work in is off-line. In the meantime, I had figured that all I needed to do was retain only those rows of the stacked matched data frame whose 'subclass' factor value contains a "_" (underscore); is this correct? All the best, --
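A minimal base-R sketch of that workaround, for readers in the same off-line situation. The data frame below is a hypothetical toy stand-in, not real MatchIt output: it assumes the affected rbind() result contains each matched unit twice, once with a plain 'subclass' label and once with an underscore-disambiguated one, which is the pattern the filter relies on.

```r
# Toy stand-in for the stacked output (hypothetical data, not MatchIt output):
# every unit appears twice, once with a plain subclass label and once with an
# underscore-disambiguated label.
md_all <- data.frame(
  id       = rep(1:4, times = 2),
  treat    = rep(c(1, 0, 1, 0), times = 2),
  subclass = factor(c("1", "1", "2", "2", "1_1", "1_1", "1_2", "1_2"))
)

# Keep only rows whose subclass label contains a literal underscore.
# grepl() returns FALSE for NA, so rows with a missing subclass are dropped too.
keep <- grepl("_", as.character(md_all$subclass), fixed = TRUE)

# droplevels() removes the now-unused plain labels from the factor.
md_fixed <- droplevels(md_all[keep, ])

nrow(md_fixed)             # half the rows of md_all in this toy example
levels(md_fixed$subclass)  # only the underscore-containing labels remain
```

After filtering, it may be worth checking that nrow(md_fixed) equals the sum of the row counts of the individual matched subsets before running balance diagnostics.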
That should work. You can also manually do what Change the levels of
Thank you for your confirmation, Noah; much appreciated. Indeed, your proposed approach is formally equivalent to the one I've outlined and am adopting. You'll have noticed I sent you a separate query on your Gmail account, as it wasn't related to this (or any other) bug. By all means take your time to address it, if at all: I don't mean to abuse your availability. All the best, --
Dear Noah,
Many thanks for packaging together so many matching-related facilities in your excellent MatchIt R library. I've recently started using it because of its versatility, but have stumbled upon what I suspect may be a bug in the rbind() function.
I've been applying separate Optimal Matching routines to non-overlapping subsets of a data-set identified according to a 'stratum' factor, with the intention of then binding derived matched sub-samples into a full matched data-set via rbind(). However I've been noticing that, by doing so, I end up with a fully matched data-set that has twice the number of rows I'd expect. For instance, running the example around the LaLonde data-set detailed in the on-line documentation to rbind.matchdata() I obtain nrow(md_b) = 174, nrow(md_h) = 22 and nrow(md_w) = 36 but nrow(md_all) = 464 = 2 * (174 + 22 + 36), instead of the expected 232 = 174 + 22 + 36.
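For comparison, here is a rough base-R sketch of stacking matched subsets by hand while keeping 'subclass' labels unique, together with the row-count check described above. The helper `stack_matched()` and the toy data frames are hypothetical illustrations, not part of MatchIt, and the sketch does not reproduce the internals of rbind.matchdata(); it assumes each input is a data frame with a 'subclass' column.

```r
# Hypothetical helper (not part of MatchIt): stack matched data frames and
# prefix each subclass label with the index of its source data frame, so that
# labels from different subsets cannot collide.
stack_matched <- function(...) {
  parts <- list(...)
  for (i in seq_along(parts)) {
    # as.data.frame() drops any extra classes so that base rbind() is used.
    parts[[i]] <- as.data.frame(parts[[i]])
    s <- as.character(parts[[i]]$subclass)
    # Leave missing subclass values as NA rather than labelling them "i_NA".
    parts[[i]]$subclass <- ifelse(is.na(s), NA_character_, paste(i, s, sep = "_"))
  }
  out <- do.call(rbind, parts)
  out$subclass <- factor(out$subclass)
  rownames(out) <- NULL
  out
}

# Toy matched subsets (hypothetical data).
md_a <- data.frame(treat = c(1, 0, 1, 0), subclass = factor(c(1, 1, 2, 2)))
md_b <- data.frame(treat = c(1, 0), subclass = factor(c(1, 1)))

md_all <- stack_matched(md_a, md_b)

# The stacked data should have exactly as many rows as the parts combined.
stopifnot(nrow(md_all) == nrow(md_a) + nrow(md_b))
levels(md_all$subclass)
```

The same check applied to the figures reported above gives an expected total of 174 + 22 + 36 = 232 rows, against the 464 observed.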
I understand from the rbind.matchdata() documentation that the function's main purpose is to disambiguate the 'subclass' factor levels when stacking the matched sub-samples; is the duplication of rows, however, an intended consequence? If so, it remains unclear to me how I can then carry out diagnostic checks (such as those based on SMDs or VRs) on the re-combined matched data-set. Am I missing something here? Of note, I'm running MatchIt 4.3.2 on R 4.0.3 on Windows.
Thank you in advance for your help and time!
--
Stefano