-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better treatment indicator handling, NA values in match_on.formula
#124
Comments
I believe my conjecture was right. Including a second treated unit is sufficient to get a valid match. Then the workaround gets the right answer:
|
Interesting. Thanks to Colin Hubbard for the tip. I did a recover() on the example, here's what I'm seeing inside of
This seemed to me to suggests the error may be somewhere different from where @markmfredrickson was thinking. (But then Mark's last post popped up 15 seconds ago, so maybe I'm just not following...) |
Edit: Oops just looked at Mark's commit which already solved it! |
@josherrickson is correct. See 4955bb9 for my changes to that line of code. |
@markmfredrickson Might it not be better to enforce that numeric treatment vectors be 0/1? Either that or make it clear in the documentation that with other values we assume the lower is control. |
Interesting point by @josherrickson . If Colin Hubbard is watching, I'd appreciate hearing about his usage scenario. Ordinarily I'd think that if some level of dose other than 0 were to serve as a control condition it would be best to have the function insist that that level be explicitly specified. Otherwise you might find that adding or removing a single observation totally changes the matching structure in unexpected ways, if the addition or removal of that obs resulted in a change to the lowest observed value of the treatment variable. Perhaps I'm missing another usage scenario. Absent new info to that effect, I'm inclined to think Josh has a good point. |
I agree with @josherrickson, with the amendments that:
|
Not clearly germane to the issue at hand, but if anyone understood why this errored out
but this did not --
-- I'd be interested to know. (Also your thoughts about fixability, if any.) |
I don't see that error.
|
@benthestatistician Are you implying that |
Weird, I'm certainly getting the error.
@josherrickson are you using optmatch 0.9-7 or development optmatch? Maybe the problem already got fixed. (If so, good to note so that we don't re-break it.) Getting back to the main issue, I think that |
I'm going to push back a bit on this. My primary view is that we should accept two (and only two) versions of the treatment vector: 0/1/NA and TRUE/FALSE/NA. (I'm willing to concede accepting "0"/"1"/"NA" as well, though I think we should not). My rationale is that by only accepting a very strict set of treatment vectors, we force users to be explicit in their handling of treatment and hopefully avoid surprises. While [1] The most common example I see of this is from survey data, where 1 = "no", 2 = "yes", 3 = "did not respond". If a client does the common trick of |
Excellent points, @josherrickson . I'm persuaded. In effect, this is an API change, so let's be sure to provide those users who'll all of a sudden be getting new errors with a clear trail of crumbs toward a solution, |
It occurred to me that we should update Not a heavy lift, I think... |
I agree. Having trouble compiling right now so I can't address it at the moment. Something like this should work:
I will take a look when I figure out why my computer is refusing to compile optmatch (I think it is an Rcpp issue:
but I'm not sure. I feel like everytime I try and build a package these days, Rcpp modifies R/RCppExports.R and src/RcppExports.cpp without my say-so.), but feel free to make the changes sooner if you're trying to get a new version pushed out. |
I agree on each point, @josherrickson . Re compile problems, I think @markmfredrickson reported having similar trouble, then after some effort pushing through it. Mark, is this correct? Can you share some wisdom? |
My laptop still builds fine. I'll have this updated shortly. |
Updated in b4908b3. |
This has been completed I believe. |
Thanks @josherrickson! |
Colin Hubbard sent in a bug report based on the following:
At least as a work around, it would make sense to be able to set up an indicator variable, but that fails with NA values in the distances:
I conjecture that the NA is due to the degenerate case of having only one treated unit, which makes the SD calculation fail when computing the SD of the treated group, but this is worth checking out in more depth.
The text was updated successfully, but these errors were encountered: