
Impossibly low pre-Maria mortality rate #7

Closed
davidkane9 opened this issue Jun 10, 2018 · 8 comments

Comments

@davidkane9 commented Jun 10, 2018

I calculate a mortality rate of 2.6 deaths (95% confidence interval [CI], 1.4 to 3.8) per 1000 persons from January 1 through September 19, 2017. This mortality rate is inconsistent with the rate calculated from the official monthly statistics for 2010 through 2017: 8.3 deaths (95% CI, 8.2 to 8.4) per 1000 persons. It is, statistically, almost impossible for there to have been only 18 deaths in the 3299 households prior to Maria. The problem remains even with the authors’ later calculations adjusting for household size.
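
For concreteness, here is a rough sketch of that claim in Python (the figure of 9,522 surveyed individuals is the paper's enumeration and the 262-day window runs January 1 through September 19; neither number appears in this comment, so treat this as an approximation):

```python
# Rough sketch: how unlikely are 18 or fewer deaths if the official rate of
# 8.3 per 1,000 per year applied to the surveyed population before Maria?
# Assumes ~9,522 surveyed individuals (the paper's enumeration) and a
# 262-day window (Jan 1 - Sep 19, 2017).
from scipy.stats import poisson

persons = 9522
days = 262
official_rate = 8.3 / 1000                       # deaths per person per year
expected = persons * official_rate * days / 365  # about 57 expected deaths
print(f"expected deaths: {expected:.1f}")
print(f"P(18 or fewer deaths): {poisson.cdf(18, expected):.1e}")  # on the order of 1e-9
```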

@rafalab (Collaborator) commented Jun 11, 2018

This is consistent with our observation that deaths in small households were undercounted (as stated in the paper). This code implements a very conservative adjustment that yields a 95% confidence interval of [2.5, 7.5]. Why is this a conservative adjustment? If you go through that code carefully, you will see that the households of size 1 have a median age of 69. In our code we plug in the overall rate for the missing households. But if you instead plug in a rate consistent with the death rate for 69-year-olds, say 20 per 1,000, then the adjusted rate goes up to [4.3, 11.7].
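
For readers who do not want to dig through the repository, here is a toy sketch of the general plug-in idea (the counts below are placeholders rather than the survey's numbers, and this is not the adjustment code itself, so it will not reproduce the intervals quoted above):

```python
# Schematic of a plug-in adjustment: assume a rate for the persons whose
# deaths could not be reported (e.g., single-person households) and average
# it with the observed rate. Counts are placeholders; the real adjustment
# lives in the linked repository.
def plug_in_adjusted_rate(observed_rate, n_observed, assumed_rate, n_missing):
    """Person-weighted average of the observed rate and an assumed rate,
    all expressed as deaths per 1,000 persons per year."""
    return (observed_rate * n_observed + assumed_rate * n_missing) / (n_observed + n_missing)

# Conservative plug-in (overall rate) vs. one consistent with 69-year-olds:
print(plug_in_adjusted_rate(2.6, 9000, 8.3, 1000))   # placeholder counts
print(plug_in_adjusted_rate(2.6, 9000, 20.0, 1000))
```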

@davidkane9 (Author) commented Jun 11, 2018

This is consistent with our observation that deaths in small households were undercounted

Sure. But it is "consistent" with lots of other things as well, like problems in survey design, data collection and so on.

a very conservative adjustment that yields a 95% confidence interval of [2.5, 7.5]

Correct me if I am wrong, but this estimate is almost (statistically) impossible, if you believe that the 2010-2017 official mortality counts are accurate. Again, that data shows 8.3 deaths (95% CI, 8.2 to 8.4) per 1000 persons. If that is the truth (and the survey data/methodology is accurate), it is impossible to have observed so few deaths pre-Maria, even if we allow for this conservative adjustment.

if you instead plug in a rate consistent with the death rate for 69-year-olds, say 20 per 1,000, then the adjusted rate goes up to [4.3, 11.7]

What would this adjustment do to your estimate of post-Maria mortality? Back of the envelope, wouldn't it go to an (implausibly?) high number like 30? We can only evaluate the plausibility of an adjustment if we consider its effect on both the pre- and post-Maria estimates. (Or, I guess one could argue that some adjustments are necessary pre- but not post-, or vice versa. But that does not seem to be your argument.)

Again: HUGE KUDOS to everyone involved in this project for your openness and transparency. This is the way that science ought to be done.

@rafalab (Collaborator) commented Jun 11, 2018

Sure. But it is "consistent" with lots of other things as well, like problems in survey design, data collection and so on.

If you have any statistical evidence that any of these things happened, please share it. But note that we did not use the before hurricane rate estimates for any of our analysis.

What would this adjustment do to your estimate of post-Maria mortality? Back of the envelope, wouldn't it go to an (implausibly?) high number like 30?

No need for back-of-the-envelope calculations. You can edit the code, which is public, to find the answer. I went ahead and ran it: the CI is [12.6, 24.3], not 30.

(Or, I guess one could argue that some adjustments are necessary pre- but not post-, or vice versa. But that does not seem to be your argument.)

One bias that might affect the before more than the after is recall bias. But, again, note that we did not use the before hurricane rate estimates for any of our analysis.

@davidkane9 (Author) commented Jun 13, 2018

any statistical evidence that any of these things happened

I have statistical evidence that something happened which produced implausible data. One can believe the 18 deaths from before Maria (and then engage in a bunch of upward adjustments to make the resulting mortality estimate more plausible) or one can believe in the 38 deaths post-Maria (and then engage in some downward adjustments to make the resulting mortality estimates more plausible) but it is very hard to credit both numbers at the same time.
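
To make the tension concrete, a quick sketch (the roughly 102-day post-hurricane window is my reading of the study period, September 20 through December 31, and is not stated in this thread):

```python
# Implied jump in the daily death rate if both raw survey counts are taken at
# face value. The 262-day and ~102-day windows are assumptions about the
# pre- and post-hurricane study periods.
pre_deaths, pre_days = 18, 262     # Jan 1 - Sep 19, 2017
post_deaths, post_days = 38, 102   # Sep 20 - Dec 31, 2017 (approximate)
ratio = (post_deaths / post_days) / (pre_deaths / pre_days)
print(f"implied post/pre ratio of daily death rates: {ratio:.1f}")  # about 5.4
```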

we did not use the before hurricane rate estimates

That is a bug, not a feature, of the paper. It is good that you compare various measures from the survey with ACS data. Excellent stuff! Anytime we compare survey data with other sources in order to confirm the accuracy of the survey, we are doing good science. But then why didn't you compare the pre-Maria mortality rate from the survey with the official statistics? That seems a big oversight. After all, this is a paper about mortality! There is no more important way to judge the quality of the survey than to see if it got pre-Maria mortality "correct."

Either doing that comparison failed to occur to you or you did do it and failed to report it. Neither option is a good look.

the CI is [12.6, 24.3], not 30

Doesn't that raise all sorts of alarm bells for you, now that you have access to official data? Again, this is not a critique of the paper since that data was not available to you then. You provide this (excellent!) analysis of the official data, which suggests 1,750 or so excess deaths. (That seems a reasonable estimate to me.) But the center of the CI you use above would be more like 7,000 deaths, and (back of the envelope) I suspect that its lower bound would exclude your own estimate based on official data.
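
For reference, the back-of-the-envelope conversion from an annualized mortality rate to island-wide excess deaths looks roughly like this (the population, baseline rate, and window below are my assumptions, so the exact total will differ somewhat from the figure above):

```python
# Rough conversion from an annualized mortality rate to island-wide excess
# deaths. Population (~3.3 million), baseline (8.3 per 1,000 per year), and a
# ~102-day post-hurricane window are assumptions; the result is sensitive to
# all three, which is why back-of-the-envelope figures differ.
population = 3.3e6
baseline_rate = 8.3 / 1000    # official deaths per person per year
survey_rate = 18.5 / 1000     # roughly the midpoint of the adjusted CI above
days = 102
excess = (survey_rate - baseline_rate) * population * days / 365
print(f"implied excess deaths: {excess:,.0f}")  # on the order of 9,000 with these inputs
```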

In other words, if you use the same adjustments on post-Maria estimates, you get absurdly high numbers of excess deaths. But if you don't use those adjustments on pre-Maria estimates, you get an absurdly low mortality rate.

I find it very hard to believe, simultaneously, that these families suffered 18 deaths pre-Maria and 38 deaths post-Maria. One or the other might be true. But, if you use the same methodology on both, you either end up with pre-Maria mortality rates that are too low or post-Maria mortality rates that are too high.

@rafalab (Collaborator) commented Jun 13, 2018

I find it very hard to believe, simultaneously, that these families suffered 18 deaths pre-Maria and 38 deaths post-Maria.

We did as well. We therefore did not use both the pre- and post-hurricane data. For the main estimate presented in the paper, we did assume the post-hurricane data was unbiased. But we did not think the pre-hurricane data was usable without further exploration into explanations for the bias. Recall bias, for example, can affect the pre much more than the post.

We do not claim our approach is perfect or final, and in the paper we explicitly say that "we make the data available for additional analyses". Now that the government has finally made the official data public, we can see which adjustments end up matching these data better, as a way to develop approaches going forward. Of course, we couldn't do that for the paper because the government did not share the data with us and only made it public three days after the paper was published. That said, I do think the possibility that these official data are missing some deaths is real. At the moment it is hard to tell how many.

I should mention that, personally, I think putting so much emphasis on the excess count estimate is a mistake. Even if we find the perfect adjustment, the estimate's uncertainty will remain very high. My main takeaways from studying this dataset, which includes more than death counts, were:

1. The official count of 64 is clearly an undercount (based on our data and other reports from groups that apparently had access to official counts).
2. The interruption of services was concerning (Figure 3).
3. The reported causes of death were concerning: lack of access to medical care, for example (Figure 4B).
4. The effect of the hurricane went well beyond September (Figure 4B).

I hope the government uses this information to better prepare for the next hurricane.

@davidkane9 (Author) commented Aug 7, 2018

Just wanted, again, to congratulate everyone involved with this paper for their transparency. This is the way all science should be done!

@mkiang (Collaborator) commented Aug 9, 2018

Thanks, @davidkane9.

Not sure if you saw it, but we've elaborated on the pre-hurricane mortality in the Technical FAQ (dated 7/15/2018).

If this is sufficiently addressed for you, I'd like to close the issue. You're obviously welcome to submit new ones if you come across them.

@davidkane9 (Author) commented Aug 9, 2018

Feel free to close the issue. I hope to have some other comments on different estimands for this study, but those probably belong elsewhere.

@mkiang closed this Aug 9, 2018
