Repeated use restricting to e(sample) leads to different results #203
Comments
Note that repeated saving of imputed variables with different suffixes is not working:
returns an error message. This is because we are confirming vars with
@Constantino-Carreto-Romero note the issue above, #203 (comment), and the fix in 548b1a6. I think this is also an issue for #177 that we just closed, because it means that when
Can you please open a new issue to check all confirms to make sure this doesn't happen? Note that for the fix in #177 you may need to move the check for replacing to a later place in the code, when the code already knows the new kvars and can confirm them individually.
After 548b1a6 the above code works (#203 (comment)), and it shows that the imputed variable differs across runs, so the issue has to do with the imputation.
Actually the issue may be deeper since it shows up without any imputation:
This is not an imputation issue, and I don't think this is a bug. This code checks the estimation samples for the two runs:
This shows that the samples differ between runs, and it lists some of the ids where this happens. These are the same ids that are excluded in the second run because of ambiguous event times. After that, if we check the sample for missing values for one of the ids, we see that tenure is missing precisely when the policy variable, union2, changes from zero to one. So in the first run, this observation is excluded because of missing tenure. The second run is restricted to the estimation sample of the first run. Since that observation is not in the sample for the second run, the observations for id=13 have ambiguous event times, so the entire unit is excluded. The same happens for the other ids in the sample. I'll now check the imputation case.
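The mechanism described above can be sketched with a toy example. This is not xtevent's code: the years, the tenure values, and the rule in the hypothetical helper `event_time_status` (a switch is identified only when the last 0 and the first 1 of the policy variable are observed in consecutive periods) are all assumptions made for illustration. The sketch only shows why restricting to the first run's sample before building event times can drop a whole unit.

```python
# Toy illustration only; names, data, and the ambiguity rule are assumptions.

def event_time_status(policy_by_year):
    """Classify a unit from its observed policy path (year -> 0/1)."""
    years = sorted(policy_by_year)
    ones = [y for y in years if policy_by_year[y] == 1]
    if not ones:
        return ("never", None)
    first_one = ones[0]
    zeros_before = [y for y in years
                    if y < first_one and policy_by_year[y] == 0]
    if not zeros_before:
        # Already treated when first observed: switch date unknown.
        return ("ambiguous", None)
    if first_one - zeros_before[-1] == 1:
        return ("identified", first_one)
    # A gap between the last 0 and the first 1: switch date unknown.
    return ("ambiguous", None)


def estimation_sample(policy_by_year, control_by_year):
    """Build event times first, then drop rows with a missing control."""
    status = event_time_status(policy_by_year)
    if status[0] == "ambiguous":
        return status, []  # the entire unit is excluded
    sample = [y for y in sorted(policy_by_year)
              if control_by_year[y] is not None]
    return status, sample


# Hypothetical unit: the control (tenure) is missing exactly in the
# year the policy (union2) switches from 0 to 1.
union2 = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
tenure = {1: 0.5, 2: 1.5, 3: None, 4: 3.5, 5: 4.5}

# Run 1: event times come from the full policy path.
run1_status, run1_sample = estimation_sample(union2, tenure)

# Run 2: restrict to run 1's sample before building event times.
run2_status, run2_sample = estimation_sample(
    {y: union2[y] for y in run1_sample},
    {y: tenure[y] for y in run1_sample},
)

print(run1_status, run1_sample)
print(run2_status, run2_sample)
```

In the first run the switch is identified and only the year with missing tenure leaves the sample. In the second run the policy path has a gap around the switch, so under the toy rule the unit is classified as ambiguous and all of its observations are dropped.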
It's a good idea to return the list of units excluded due to ambiguous event times so they can be checked. I added this in 3a84461 and 7afc04c. After doing this, I checked whether the units excluded due to ambiguous event times when using impute(nuchange) (a larger set than without imputation) had the same issue of missing tenure when the policy activates, and they do. So the issue is the same overall. In summary, I think this is not a bug. @Constantino-Carreto-Romero could you read the posts on this issue and let me know if you agree? If so we can start a PR to add these changes.
@jorpppp
@Constantino-Carreto-Romero Yes, this makes me think that we just need to set the sample to the one used when creating the event-time dummies. That way the results will replicate using e(sample), and at the end of the day, that is the sample that
Summary: In this issue we addressed an issue with
… (#208)
* #203 repeated use restricting to e(sample) leads to different results… (#207)
* #203 confirm imputed variable with confirm, exact instead of unab
* #203 Return list of excluded units due to ambiguous event time
* #203 Modify help file
* #203 Set sample to larger sample used to create event-time dummies
* #203 #205 Add seed to test file
* #206 Change endpoint dummy var labels
Consider this example
The second run over the estimation sample from the first run leads to different results. This probably has to do with the impute option.
I am going to label this as a bug for now because it looks like it.