Update Medicare and Medicaid values in cps.csv.gz file #185

Merged (10 commits) on Aug 10, 2018

Conversation

andersonfrailey
Collaborator

This PR addresses C-TAM issue #68 and implements a change to the value of Medicare and Medicaid in the CPS file based on the discussion in the aforementioned issue.

@Amy-Xu, is what I've done here what you were suggesting?

I've labeled it "WIP" because I will need to go back and update stage 3 of the CPS file creation process once I've gotten the thumbs up from @Amy-Xu.

cc @MaxGhenis

"""
# replace medicare and medicaid
Contributor

This code should run faster than lines 348-354:

medicare_cols = 'MCARE_VAL' + pd.Series((np.arange(16) + 1).astype(str))
medicaid_cols = 'MCAID_VAL' + pd.Series((np.arange(16) + 1).astype(str))

count_medicare = data[medicare_cols].astype(bool).sum(axis=1)
count_medicaid = data[medicaid_cols].astype(bool).sum(axis=1)

See https://drive.google.com/file/d/1Fw8rcvcERKs9llMf6dfOVAPuqwuqUUah for an example.
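As a quick check of the suggestion above, the sketch below compares the vectorized count with an explicit loop on made-up data. The column names follow the snippet; the toy values and the three-person limit (the real file uses 16 person slots) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

N_PEOPLE = 3  # assumption: 3 slots keeps the toy data small (real file: 16)

# Toy tax-unit data: one row per tax unit, one MCARE_VAL column per person.
data = pd.DataFrame({
    "MCARE_VAL1": [9000.0, 0.0, 12000.0],
    "MCARE_VAL2": [8000.0, 0.0, 0.0],
    "MCARE_VAL3": [0.0, 0.0, 0.0],
})

medicare_cols = ["MCARE_VAL{}".format(i) for i in range(1, N_PEOPLE + 1)]

# Loop version: add 1 for every person with a positive value.
count_loop = np.zeros(len(data), dtype=int)
for col in medicare_cols:
    count_loop += np.where(data[col] > 0, 1, 0)

# Vectorized version: nonzero values become True, then sum across columns.
count_vec = data[medicare_cols].astype(bool).sum(axis=1)

print(count_loop.tolist())
print(count_vec.tolist())
```

The two versions agree here because every value is zero or positive. `astype(bool)` would also count a negative value as a recipient, whereas the `> 0` test would not.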

Collaborator Author

Thanks for the tip!

medicaid_var = 'MCAID_VAL{}'.format(i)
count_medicare += np.where(data[medicare_var] > 0, 1, 0)
count_medicaid += np.where(data[medicaid_var] > 0, 1, 0)
new_medicare = count_medicare * 12000
Contributor

Are 12000 and 6000 necessary here? I think they just get divided away by the scale* columns. My understanding is that the new benefit values equal the total current benefit values divided by the number of recipients. If so you can remove the new_medica* columns and just use the counts.

If I'm missing something and they are needed, WDYT about making them constants with a brief explanation of why they're those values?
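To illustrate the point about the constants cancelling, here is a small sketch. It assumes, as the comment suggests, that the placeholder values are rescaled so their weighted sum matches a target total. The weights, the target, and the helper name `scaled_benefit` are hypothetical and are not taken from the PR's code.

```python
import numpy as np

def scaled_benefit(count, weight, placeholder, target_total):
    """Scale count * placeholder so the weighted sum equals target_total."""
    raw = count * placeholder
    scale = target_total / (raw * weight).sum()
    return raw * scale

count = np.array([2, 0, 1, 1])                   # recipients per tax unit
weight = np.array([100.0, 250.0, 150.0, 50.0])   # made-up sampling weights
target = 4_000_000.0                             # made-up aggregate to hit

with_12000 = scaled_benefit(count, weight, 12000, target)
with_1 = scaled_benefit(count, weight, 1, target)

# The placeholder cancels: both choices give the same per-unit benefit.
print(np.allclose(with_12000, with_1))
print(with_1.tolist())
```

Under this assumption, any single placeholder constant yields the same result, so the counts alone are enough. That no longer holds once different groups get different placeholder values.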

Member

I think Anderson is implementing my proposal, which I described in the C-TAM issue. The 12k and 6k are sort of the insurance value based on imputed benefits from MEPS, even though more precisely they should be
[Screenshot dated 2018-04-26, 5:10:15 PM: table of insurance values by income group; image not preserved]

I see where you're coming from. That would avoid the scaling step. One caveat, though: for Medicare we still want to differentiate the benefit by income group.

Member

The income upper bound refers to individual-level WAS. In our case, maybe we could say that if either the primary or secondary earner has WAS higher than $900k, we just give them the lower value of $8,776. Do we even have people with income that high in the CPS tax unit dataset? If we have a good number of them, it is probably worth the time to use this table. If not, I think it's easier to go with what Max suggests. The result should be the same.

Collaborator Author

We do have people with WAS higher than $900K. I'll switch the variables to constants. It would probably be worth differentiating those higher earners, because there's a pretty significant difference in insurance value.

Contributor

Isn't $900K the upper bound for the quintile? So since it's the top quintile, it's just the maximum observed income? If so, since $0 is the upper bound for the 4th quintile, should anyone with positive income be assigned the $8,776 value? Probably some with $0 too, but that'd require some randomness; alternatively, those with positive incomes could be set to something lower than $8,776 such that the average when adjusting for the share with $0 is $8,776.

Contributor

In fact I'm wondering if the quintile table in general would be simpler and more usable for this purpose if it only has two splits, for both Medicare and Medicaid: $0 income and >$0 income.

@Amy-Xu
Member

Amy-Xu commented Apr 26, 2018

is what I've done here what you were suggesting?

Yep! I posted the table for insurance value in my in-line comment. Being a lazy person, if I were doing this, I would define the constant at the top, and add a link to the general C-TAM documentation.

@martinholmer
Contributor

Why exactly is the "insurance value" of Medicaid and Medicare being calculated by income group rather than for the whole population?

@Amy-Xu @andersonfrailey @MaxGhenis @MattHJensen

@Amy-Xu
Member

Amy-Xu commented Apr 30, 2018

@martinholmer This is what Dan suggested to me. @feenberg Dan, do you mind explaining in detail here? Thanks!

@martinholmer
Contributor

martinholmer commented Apr 30, 2018

@martinholmer asked:

Why exactly is the "insurance value" of Medicaid and Medicare being calculated by income group rather than for the whole [recipient] population?

And @Amy-Xu responded:

This is what Dan suggested to me.

The research I'm familiar with computes a single actuarial value of medical benefits for the whole recipient population (not for subgroups of the recipient population). But maybe I'm unaware of other approaches. Can you provide links to research papers that compute different actuarial values for different subgroups of the recipient population?

@feenberg

feenberg commented May 3, 2018

I can imagine that the justification for different imputed insurance values by income is that higher income people have better health. But that hardly takes account of the value to the recipient of the insurance, which can be low for low-income households who have more pressing needs. I believe the government generally values the insurance at the amount of health care spending the family would otherwise have made. This seems too low. Finkelstein has a paper on this, and we could borrow her numbers: https://economics.stanford.edu/sites/default/files/valueofmedicaid_nov17_2016.pdf but it doesn't provide different estimates by income. The numbers are probably more defensible than the alternatives.

@martinholmer
Contributor

martinholmer commented May 3, 2018 via email

@MaxGhenis
Contributor

The difference in Medicare benefit values between those with income and those without seems significant enough to be worth modeling (30%+). Wouldn't supplemental insurance explain a lot of this?

@feenberg

feenberg commented May 3, 2018 via email

@MaxGhenis
Contributor

What doesn't seem sensible to me is to add the transfer to income. Sickness didn't make the person richer, why should we move them to a higher income category?

I agree with this, but I think it differs from the case of a Medicare recipient deriving less insurance value from Medicare because they have supplemental insurance. In this case it doesn't seem as out there to say the Medicare recipient without supplemental insurance is made "more richer" by Medicare than the one with supplemental insurance. That said, unless there's data on this, it's probably hard to untangle from health differences.

ACA subsidies aren't included in taxdata, right? Would any of these choices affect them if they were?

@Amy-Xu
Member

Amy-Xu commented May 10, 2018

@andersonfrailey it seems like assigning every enrollee a simple average is the best option at the moment, according to issue PSLmodels/C-TAM#71. Could you revise the code when you get a chance?

@andersonfrailey
Collaborator Author

@Amy-Xu, yep I'll get to work on that.

@andersonfrailey
Collaborator Author

@Amy-Xu, can you confirm the direction we decided to go in with this PR?

@Amy-Xu
Member

Amy-Xu commented Jun 20, 2018

@andersonfrailey Yep, we want to assign one uniform value to every beneficiary, and the value is calculated from total cost over total beneficiaries.

@andersonfrailey
Collaborator Author

@Amy-Xu got it. So just to be explicit, we're going to sum up the entire value of medicare and medicaid in the file, divide it by the total number of recipients, and assign that value to every beneficiary, correct?

@Amy-Xu
Member

Amy-Xu commented Jun 20, 2018

@andersonfrailey I think so. The imputed sum should be equal to the administrative total for the benefits.
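The approach agreed on above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in finalprep.py: the column names `mcare_count` and `weight`, and all the numbers, are made up for the example. Weighted sums are used so that the weighted aggregate is preserved.

```python
import pandas as pd

# Toy data: current benefit per tax unit, recipient count, sampling weight.
df = pd.DataFrame({
    "mcare_ben": [20000.0, 0.0, 8000.0, 16000.0],
    "mcare_count": [2, 0, 1, 1],             # hypothetical column name
    "weight": [100.0, 200.0, 300.0, 100.0],  # hypothetical column name
})

# Weighted total benefit and weighted number of recipients.
total_benefit = (df["mcare_ben"] * df["weight"]).sum()
total_recipients = (df["mcare_count"] * df["weight"]).sum()

# One uniform value per beneficiary.
per_recipient = total_benefit / total_recipients

# Each tax unit gets (number of recipients) x (uniform value).
df["mcare_ben_new"] = df["mcare_count"] * per_recipient

new_total = (df["mcare_ben_new"] * df["weight"]).sum()
print(per_recipient, total_benefit, new_total)
```

In this toy case the uniform value is 10,000 per recipient and the weighted total is the same before and after the reassignment.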

@andersonfrailey
Collaborator Author

@Amy-Xu can you review my latest commit?

mcare_ben 1700697749
mcaid_ben 904042846
mcare_ben 1778073024
mcaid_ben 888211102
Member

Are these two the total amounts of benefits for medicare and medicaid respectively?

Collaborator Author

These are the unweighted totals, yes. The weighted totals didn't change.

Member

Got you.

@andersonfrailey
Collaborator Author

@Amy-Xu are you comfortable with the latest changes in this PR? Can I go ahead and merge it at the end of the day?

@Amy-Xu
Member

Amy-Xu commented Jul 3, 2018

I think the code looks good to me, but it might be more helpful if the aggregates displayed are the weighted totals, because the changes will be more sensible and we can compare them with the administrative totals. That said, it could be an improvement later on.

@andersonfrailey
Collaborator Author

Thanks for taking a look, @Amy-Xu. We might add a test later that shows weighted aggregates. With regard to this PR, though, the weighted aggregate totals have stayed the same.
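A check along these lines might look like the sketch below. It is not the test in the repository: the function name `benefit_totals`, the `weight` column, and the toy data are assumptions. It shows how a redistribution can change the unweighted sum while leaving the weighted sum unchanged.

```python
import pandas as pd

def benefit_totals(df, col, weight_col="weight"):
    """Return (unweighted sum, weighted sum) for one benefit column."""
    unweighted = df[col].sum()
    weighted = (df[col] * df[weight_col]).sum()
    return unweighted, weighted

df = pd.DataFrame({
    "weight": [100.0, 300.0],
    "old_ben": [16000.0, 8000.0],   # made-up values before the change
    "new_ben": [10000.0, 10000.0],  # uniform value after the change
})

old_unweighted, old_weighted = benefit_totals(df, "old_ben")
new_unweighted, new_weighted = benefit_totals(df, "new_ben")

print(old_unweighted, new_unweighted)  # unweighted sums differ
print(old_weighted, new_weighted)      # weighted sums agree
```

Reporting both numbers side by side would make it easier to compare the weighted totals against administrative totals, as suggested above.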

@martinholmer
Contributor

@andersonfrailey, Why does the test-calculated unweighted sum of other_ben change in PR #185? PR #185 contains code changes that would seem to cause only minor changes in the test-calculated unweighted sums of mcare_ben and mcaid_ben.

Once #261 and #271 are merged (on Friday?), does it make sense to remove merge conflicts, and then double-check code changes, and then merge #185?

@andersonfrailey
Collaborator Author

@martinholmer, the unweighted sum of other_ben changes because we use mcare_ben and mcaid_ben to determine the distribution of other_ben. So when we changed the distribution of mcare_ben and mcaid_ben, the unweighted sum of other_ben may change to ensure the weighted sum of other_ben is properly distributed.

Once the two PRs you mentioned are merged, I believe it does make sense to remove the merge conflicts here and merge.

@martinholmer
Contributor

@andersonfrailey said:

the unweighted sum of other_ben changes because we use mcare_ben and mcaid_ben to determine the distribution of other_ben. So when we changed the distribution of mcare_ben and mcaid_ben, the unweighted sum of other_ben may change to ensure the weighted sum of other_ben is properly distributed.

OK. Thanks for the explanation.

@andersonfrailey concluded:

Once the two PRs you mentioned [#261 and #271] are merged, I believe it does make sense to remove the merge conflicts here and merge.

OK. Do you think the merger of #185 can happen tomorrow (Friday)? When would you like me to merge #271, so that you have time to work on #185?

martinholmer changed the title from "WIP: Update Medicare and Medicaid Values for CPS" to "Update Medicare and Medicaid values in cps.csv.gz file" on Aug 9, 2018
@andersonfrailey
Collaborator Author

@martinholmer, if you merge #271 tomorrow morning I'll have time to work on #185 right after lunch and barring any unforeseen obstacles have it ready to go in the afternoon.

I personally plan on having #261 ready to go by 10 or so tomorrow morning.

@martinholmer
Contributor

@andersonfrailey, I'll merge #271 and #272 no later than around noon, so that both of them (along with #261) will be in the master branch when you get back from lunch.

@martinholmer
Contributor

@andersonfrailey, I know you have asked @Amy-Xu several times during the development of PR #185 to check over the code changes and confirm that she thinks the code changes are doing the actuarial value calculation for Medicare and Medicaid correctly.

But I am much less familiar with the CPS data file and the code in the cps_data/finalprep.py script, so I want to ask a simple question. Consider two elderly-couple filing units. In the first couple, both the husband and wife are in good health and both are age 66. The second couple is exactly the same as the first couple except that the wife is age 64. So, after computing the actuarial value of Medicare per Medicare recipient (let's say it's $15,000 per year) and applying the new logic in the finalprep.py script, the first couple will have a Medicare benefit of $30,000 and the second couple will have a Medicare benefit of $15,000. Is that correct?

If that is not correct, then we may need to think about the code changes in PR #185 in more detail. If that turns out to be true, then we should not rush to merge #185 today (Friday, August 10th).

The same sort of situation arises in Medicaid, where it is common for only the kids in a family filing unit to be Medicaid (CHIP) recipients. So, for example in a family of five (husband, wife, and three kids covered by Medicaid CHIP), the Medicaid benefit for the filing unit could be just three (not five) times the actuarial value of Medicaid per Medicaid recipient.

@MattHJensen

@andersonfrailey
Collaborator Author

@martinholmer, assuming that both of those couples were assigned/non assigned correctly by C-TAM (two recipients in the first couple, one in the second), your example is correct.

The new code in finalprep.py first counts how many Medicare recipients there are in a tax unit (based on how many people in the unit have a nonzero Medicare value assigned by C-TAM), then assigns a Medicare value equal to number of recipients * Medicare value. The same is true for Medicaid.
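The two-couple example can be worked through with a short sketch. The $15,000 figure is the hypothetical value from the question above. The column names follow the `MCARE_VAL` pattern seen earlier in this PR, limited to two people per unit for brevity, and the per-person values are made up. This is an illustration, not the code in finalprep.py.

```python
import pandas as pd

ACTUARIAL_VALUE = 15000  # hypothetical per-recipient value from the example

# Two filing units with two people each; a nonzero value marks a recipient.
units = pd.DataFrame(
    {
        "MCARE_VAL1": [11000.0, 11000.0],  # husband, age 66 in both couples
        "MCARE_VAL2": [9000.0, 0.0],       # wife: age 66 vs. age 64
    },
    index=["couple_1", "couple_2"],
)

# Count recipients per unit, then multiply by the uniform value.
recipients = (units > 0).sum(axis=1)
mcare_ben = recipients * ACTUARIAL_VALUE

print(mcare_ben.to_dict())
```

The first couple has two recipients and the second has one, so the unit-level benefits come out to $30,000 and $15,000 respectively.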

@martinholmer
Contributor

@andersonfrailey explained:

assuming that both of those couples were assigned/non assigned correctly by C-TAM (two recipients in the first couple, one in the second), your example is correct.

The new code in finalprep.py first counts how many Medicare recipients there are in a tax unit (based on how many people in the unit have a nonzero Medicare value assigned by C-TAM), then assigns a Medicare value equal to number of recipients * Medicare value. The same is true for Medicaid.

Thanks for the clarification. So, it seems as if pull request #185 is good to go (after merge conflicts are eliminated) this afternoon.

@andersonfrailey
Collaborator Author

Latest commits bring this up to date. Should be ready to go now @martinholmer.

@martinholmer
Contributor

@andersonfrailey, Thanks for the up-to-date version of PR #185 and for fixing my mistake in the new check_cps_benefits test function.
