Overall inequality and poverty summary statistics #1896

Closed
MaxGhenis opened this issue Mar 1, 2018 · 16 comments
Comments

@MaxGhenis
Contributor

It would be helpful to have a function or functions to get summary statistics around inequality and poverty. These could either be part of the existing diagnostic_table function or made as new functions.

Potential metrics could include:

  • Gini coefficient based on aftertax_income, excluding negatives.
  • Share of after-tax income held by the top 10%, 1%, and 0.1%.
  • Ratio of the top 20% to the bottom 20%, also based on aftertax_income.
  • Official poverty rate, as defined here (this would take some more work).

More challenging ones would include:

I have some code for Gini and the other inequality ones, so can work on a PR there.
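
For illustration, here is a minimal sketch of how the three inequality metrics listed above could be computed with sampling weights. It is not the code referred to in this comment; the function names (`weighted_gini`, `top_share`, `quintile_ratio`) are hypothetical, and the inputs are assumed to be array-likes of income (such as aftertax_income) and weights (such as s006).

```python
import numpy as np


def weighted_gini(income, weights, drop_negatives=True):
    """Weighted Gini coefficient from the area under the Lorenz curve."""
    x = np.asarray(income, dtype=float)
    w = np.asarray(weights, dtype=float)
    if drop_negatives:
        keep = x >= 0
        x, w = x[keep], w[keep]
    order = np.argsort(x)
    x, w = x[order], w[order]
    # Cumulative population and income shares, each starting at 0.
    cum_w = np.concatenate([[0.0], np.cumsum(w)]) / w.sum()
    cum_xw = np.concatenate([[0.0], np.cumsum(x * w)]) / (x * w).sum()
    # Trapezoid rule for the Lorenz area; Gini = 1 - 2 * area.
    area = np.sum((cum_w[1:] - cum_w[:-1]) * (cum_xw[1:] + cum_xw[:-1])) / 2.0
    return 1.0 - 2.0 * area


def top_share(income, weights, top_frac):
    """Share of total weighted income held by the top `top_frac` of units."""
    x = np.asarray(income, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(-x)  # richest first
    x, w = x[order], w[order]
    target = top_frac * w.sum()
    # Weight of each unit that falls inside the top group; the unit that
    # straddles the cutoff is counted only partially.
    cum_before = np.cumsum(w) - w
    w_in = np.clip(target - cum_before, 0.0, w)
    return np.sum(x * w_in) / np.sum(x * w)


def quintile_ratio(income, weights):
    """Income of the top 20% divided by income of the bottom 20%."""
    top = top_share(income, weights, 0.2)
    bottom = 1.0 - top_share(income, weights, 0.8)
    return top / bottom
```

With a taxcalc dataframe `df`, usage would look like `weighted_gini(df.aftertax_income, df.s006)` or `top_share(df.aftertax_income, df.s006, 0.01)`. Note that only `weighted_gini` drops negative incomes here; whether the share metrics should do the same is a judgement call left to the caller.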

For the maintainers: do these belong in diagnostic_table, or should they be separate? Also, should this issue be split into one issue per metric?

cc @evtedeschi3 who's included the Gini coefficient and other inequality metrics in some past taxcalc analyses.

@codykallen
Contributor

@MaxGhenis, this is an interesting idea, but I don't think Tax-Calculator is a good place to implement poverty measures. The units included come from the population of tax filers, which naturally excludes many people with little or no income, those most relevant for poverty analyses. If you're calculating a Gini coefficient, there is an additional complication: a filing unit is neither an individual nor a household. You would need some mechanism to either connect married couples filing separately or split married couples filing jointly, and you would have to decide how to count children and non-child dependents.

That being said, if CTAM can be combined with Tax-Calculator, then the additional information from CTAM on cash and non-cash benefits could be more useful to a poverty analysis.

Also, as I've noted several times on various PRs and issues, income at the bottom of the distribution is often mismeasured.

@ernietedeschi
Contributor

ernietedeschi commented Mar 1, 2018 via email

@feenberg
Contributor

feenberg commented Mar 1, 2018 via email

@ernietedeschi
Contributor

ernietedeschi commented Mar 1, 2018 via email

@MaxGhenis
Contributor Author

if CTAM can be combined with Tax-Calculator, then the additional information from CTAM on cash and non-cash benefits could be more useful to a poverty analysis.

@codykallen Isn't that already the case? Or are there missing benefits that would be important?

Gini coefficients at the tax unit level are fine so long as the user is clear about what they are and their implications

@evtedeschi3 I agree - tc won't match other sources, but comparing Gini across reforms in my own analysis has yielded useful and directionally expected results. Incomes could also be divided by XTOT for crude individual analysis.
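
As a rough sketch of the XTOT idea, the tax-unit rows can be turned into crude per-person rows by splitting income equally across the people in each unit and scaling the weight by the same count. The helper name `per_person_frame` is hypothetical, and dropping units with XTOT equal to zero is an assumption made here, not something taxcalc prescribes.

```python
import pandas as pd


def per_person_frame(df):
    """Crude per-person income and person weights from tax-unit rows.

    Expects columns aftertax_income, XTOT, and s006.
    """
    # Assumption: units with no people (XTOT == 0) are dropped.
    out = df[df["XTOT"] > 0].copy()
    # Equal split within the unit; this ignores economies of scale.
    out["income_pp"] = out["aftertax_income"] / out["XTOT"]
    # Each unit now represents s006 * XTOT people.
    out["person_weight"] = out["s006"] * out["XTOT"]
    return out[["income_pp", "person_weight"]]
```

The resulting columns can then be passed to any weighted inequality function in place of aftertax_income and s006.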

tc doesn’t include all of the income items Census does in their absolute poverty definition.

Which are missing? Do we know how much they're likely to affect results? Are there plans to impute them? Do other tax analysis groups use incomplete data to estimate the impact of reforms on poverty?

since the SPM is partially a relative measure (based on a percentile of consumption), that adds a whole other endogenous can of worms.

Ah I'd missed that. Seems like a deal-breaker then, and also raises concerns @feenberg described (which are valid IMO - I prefer measuring poverty in absolute terms, and then inequality separately).

These issues (tax units vs household, lack of all income sources, mismeasurement of lowest earners) are good to consider, but it seems like they also apply to any estimation of the bottom decile or so, which tc currently supports. The question would be whether they're too distorted to report at all, or can be included with caveats. And whether certain users might end up reporting them with caveats outside of tc anyway...

@MaxGhenis
Contributor Author

It’s a complex debate, which punctuates Cody’s original point that this is probably beyond the purview of Tax-Calculator.

Does AEI have a preferred poverty measure?

@ernietedeschi
Contributor

@MaxGhenis wrote:

Which are missing? Do we know how much they're likely to affect results? Are there plans to impute them? Do other tax analysis groups use incomplete data to estimate the impact of reforms on poverty?

Bear in mind that the official Census money income definition used in calculating the poverty rate doesn't take taxes into account at all. It is generally a pre-tax, post-transfer measure of income. You can of course modify this to account for taxes, and many researchers do, but then it's not "official". See e.g. https://www.census.gov/topics/income-poverty/poverty/about.html

@MaxGhenis
Contributor Author

MaxGhenis commented Mar 2, 2018

the official Census money income definition used in calculating the poverty rate doesn't take taxes into account

Doh, I misread another site on this. Never mind then.

So poverty sounds hard, except for WB extreme poverty, which is affected by mismeasurement at the very bottom. Some version of SPM could possibly be done by anchoring against a particular year's thresholds, like Wimer et al. (2013), but the geographic part would still require some sort of national averaging.

So maybe this could just consider inequality metrics to start?

@ernietedeschi
Contributor

ernietedeschi commented Mar 2, 2018

Here's another thing you could mull over -- but this will take some playing around on your part as I'm thinking out loud here.

The cps.csv file now includes all the variables you need to link directly to the CPS ASEC: hh_seq, ffpos, and pulineno (as well as the survey year, which will be 1 + the tax year in the raw unprocessed cps.csv file).

The CPS ASEC has each family's official and SPM poverty status, as well as the relevant income and threshold measures for each.

In principle, then, you could merge each measure in and then make some assumptions about how your policy delta in tc affects them.

So for example SPM: you will have each family's SPM income and poverty threshold. What I'm basically thinking is you merge these variables in, then add in the change in after-tax income from your policy and recalculate poverty based on that.

Some big caveats here:

  • Tax units aren't families, and in fact many CPS families are broken up into multiple tax units for the cps.csv file. You would either have to ignore split families or find a way to reconstruct / aggregate them, which now that we have all the linking variables shouldn't be too difficult in principle.

  • Inflation. The cps.csv file is taken from three separate annual samples, so the poverty variables you import will not be inflation adjusted for whichever year you simulate policy in tc. You'll need to use the tc growth factors to either adjust the variables forward or scale your after-tax income delta back to base (and remember that as of now, CPI-U is still used for poverty purposes, not chained CPI). You also need to be aware that while the official poverty thresholds involve a simple CPI adjustment year to year, the SPM thresholds are more complex, and I don't know of a way around the fact that for SPM you'll just have to make an assumption. One option: take a basket of SPM thresholds and compare how they've grown over the last couple of years to simple CPI-U inflation.

  • The 2014 CPS ASEC, which yields the 2013 poverty variables, was, if I recall correctly, a transition year for the CPS, with part of the sample given old questions and part given new ones. I'm not sure if SPM poverty is available for that entire year's sample; if not, you'd have to come up with a way of excluding the omitted and reweighting the rest.

  • The tax bill has the ancillary effect of leading to lower health insurance coverage among low income people, which is very much germane to poverty estimates, particularly SPM. Since SPM is simply measuring total resources available to a family, it's not relevant for that measure whether the fall in health insurance coverage is by choice or not.
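
The merge-and-recalculate idea described above could be sketched roughly as follows. Everything named here is an assumption for illustration: the helper `recompute_spm_poverty`, the column names (`spm_id`, `spm_resources`, `spm_threshold`, `aftertax_income_change`), and the premise that each tax unit has already been linked to its SPM unit via the linking variables. The sketch also assumes the income change is already expressed in the same year's dollars as the SPM variables, so the inflation caveat still has to be handled beforehand.

```python
import pandas as pd


def recompute_spm_poverty(tax_units, spm_units):
    """Add tax-unit income changes to SPM resources and re-flag poverty.

    tax_units: one row per tax unit, with spm_id and aftertax_income_change.
    spm_units: one row per SPM unit, with spm_id, spm_resources, spm_threshold.
    """
    # Families split across several tax units: sum their changes.
    delta = tax_units.groupby("spm_id", as_index=False)[
        "aftertax_income_change"
    ].sum()
    out = spm_units.merge(delta, on="spm_id", how="left")
    # SPM units with no matching tax unit are assumed to be unaffected.
    out["aftertax_income_change"] = out["aftertax_income_change"].fillna(0.0)
    out["spm_resources_reform"] = (
        out["spm_resources"] + out["aftertax_income_change"]
    )
    out["poor_base"] = out["spm_resources"] < out["spm_threshold"]
    out["poor_reform"] = out["spm_resources_reform"] < out["spm_threshold"]
    return out
```

Summing at the SPM-unit level before comparing against the threshold is what handles the split-family caveat; ignoring split families instead would mean filtering them out before the merge.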

@martinholmer
Collaborator

@MaxGhenis said on March 1, 2018:

It would be helpful to have a function or functions to get summary statistics around inequality and poverty.

In the first few days of March, there was an informed discussion about all the pitfalls that would have to be avoided and all the subjective judgements that would have to be made in doing this.

There has been no further discussion over the past four or five weeks. Given that there is no consensus about how to do this in the Tax-Calculator library, it seems as if calculating inequality and poverty statistics is best left up to Tax-Calculator users with an interest in such statistics. That approach allows different users to make their own judgements about how to compute the statistics.

@MattHJensen
Contributor

MattHJensen commented Apr 6, 2018

Sorry to come to this late, but I just saw something that's best for me to address:

Does AEI have a preferred poverty measure?

No, AEI does not have institutional positions. More importantly, if AEI did have an institutional position, it wouldn't be relevant to this project because the project is governed by its core maintainers, not by AEI.

As for the substance of the issue itself, I agree with @martinholmer's conclusion that:

Given that there is no consensus about how to do this in the Tax-Calculator library, it seems as if calculating inequality and poverty statistics is best left up to Tax-Calculator users with an interest in such statistics. That approach allows different users to make their own judgements about how to compute the statistics.

cc @MaxGhenis @evtedeschi3 @feenberg @martinholmer

@MaxGhenis
Contributor Author

MaxGhenis commented Mar 17, 2019

Just saw this PSL meetup description, which looks relevant here. @evtedeschi3 are you following the approach you described in #1896 (comment)?

In a recent working paper, Mr. Tedeschi analyzes the poverty effects of the earned basic income tax credit, a proposed expansion of the current earned income tax credit. His novel approach to estimating poverty rates uses the open-source Tax-Calculator model and the Annual Social and Economic Supplement to the Current Population Survey.

If this is the Supplemental Poverty Measure, I think this is increasingly valuable for taxcalc. For example, in January, Vox reported on research from Columbia comparing the SPM effects of 5 plans from 2020 contenders (it was their front page cover story for at least a day).

Also FYI, I've added a gini function to taxcalc_helpers, which includes weights. Here's an example notebook, and the most common usage with taxcalc is:

import taxcalc_helpers as tch
# calc is a taxcalc Calculator.
df = calc.dataframe(['aftertax_income', 's006'])
tch.gini(df.aftertax_income, df.s006)
# Or, to zero out negatives:
tch.gini(df.aftertax_income, df.s006, negatives='zero')

@ernietedeschi
Contributor

ernietedeschi commented Mar 19, 2019 via email

@MaxGhenis
Contributor Author

MaxGhenis commented Mar 25, 2019

Thanks @evtedeschi3 for the great presentation today on calculating SPM with ASEC and taxcalc, and your paper applying this approach to an EBITC reform (could you share your slides?). You showed code at https://github.com/evtedeschi3/tcpoverty which splits the ASEC into tax units, then after running taxcalc sums up the change in after-tax income to the SPM unit for calculating the SPM rate.

Adding these capabilities natively to taxcalc would be useful. The most involved piece would be translating your tcpov2a_make_taxsim27.R program into a Python function and adding documentation. The more straightforward piece would be re-aggregating to SPM units and calculating SPM features.

This tax unit script is a lot simpler than the taxdata SAS scripts so I'm guessing it misses some things, but I also spoke with a Columbia poverty researcher who was doing something similar with taxcalc/ASEC, so I think it's worthwhile to have the flexibility of inputting your own ASEC. I'll need this for my own research, so if taxcalc/taxdata maintainers would prefer I can add it to my taxcalc_helpers package instead.

I'll be trying to run @evtedeschi3's process in Python in the next few days and report back how it goes.

@ernietedeschi
Contributor

Very kind of you @MaxGhenis. I've added the slides to that repository: psl_presentation_v2.pdf

Taking a step back, I think a useful first question would be "What is the goal of 'integration' here?" These scripts are relying on data outside of what Tax-Calculator currently makes available, namely the 2018 CPS ASEC.

So there are many different "levels" of changes that would automate what I did to different extents.

Off the top of my head, the simplest approach would be to modify the CLI so that a single run will produce a dump with tax changes (rather than having to run a base and then a reform sim). At the same time, streamline/automate the process for taking a more recent CPS ASEC than what's used in the cps.csv file and converting it into a data file readable by Tax-Calculator.

That would allow a user to more quickly create a simulation off of a recent ASEC that she could then manually re-merge back into the ASEC and tabulate in the manner I did.

The more complicated approach would be to fully integrate poverty output into Tax-Calculator. There might be a way to do this that just involves including the SPM unit, SPM weight, SPM threshold, and SPM resource variables into the cps.csv and then automating how they're tabulated after a reform. But it requires some thought because the SPM poverty rate is measured as a percent of all people, not a percent of families or tax units. Also, if I recall correctly, the current cps.csv only draws on data from the 2013-15 ASECs; we have three newer years that researchers will likely want to be able to access for poverty estimates. And as I mentioned in the presentation, the assumptions become even more complex once we start talking about projecting multi-year SPM poverty estimates versus single year historical counterfactual poverty estimates.
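
To make the "percent of all people" point concrete, here is a small sketch contrasting a person-based rate with a unit-based rate. The function names are hypothetical, and approximating the number of people represented as unit weight times unit size is an assumption made for illustration; it is not a statement about how the ASEC SPM weights are defined.

```python
import numpy as np


def poverty_rate_people(poor, unit_weight, n_people):
    """Share of people living in units flagged as poor."""
    poor = np.asarray(poor, dtype=bool)
    # Assumption: people represented = unit weight * people in the unit.
    people = np.asarray(unit_weight, dtype=float) * np.asarray(
        n_people, dtype=float
    )
    return people[poor].sum() / people.sum()


def poverty_rate_units(poor, unit_weight):
    """Share of units flagged as poor (not the SPM convention)."""
    poor = np.asarray(poor, dtype=bool)
    w = np.asarray(unit_weight, dtype=float)
    return w[poor].sum() / w.sum()
```

The two rates diverge whenever poor and non-poor units differ in size, which is why tabulating at the tax-unit or family level alone would not reproduce a published SPM rate.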

@MaxGhenis
Contributor Author

streamline/automate the process for taking a more recent CPS ASEC than what's used in the cps.csv file and converting it into a data file readable by Tax-Calculator.

I think this is the key part. Modifying the CLI to produce tax changes sounds worthwhile regardless of whether one is creating poverty statistics or other tax analysis.

Something like this is what I'd like to be able to do from the Python API (could translate to CLI):

import pandas as pd
import taxcalc as tc

# create_asec_tax_units, compare, agg_spm, and spm_rate are proposed
# (hypothetical) functions, not existing taxcalc API.
asec = pd.read_csv('asec.csv')
recs = tc.create_asec_tax_units(asec)
base = tc.Calculator(policy=tc.Policy(), records=recs)
# Same for reforms, plus advance_to_year(), calc_all(), etc.
# Get change in disposable income per tax unit
comp = tc.compare(base, reform)
# Aggregate change in disposable income to the SPM unit using the original ASEC
# Also adds a column for `new_spm_resources`
comp_spm = tc.agg_spm(comp, asec)
tc.spm_rate(comp_spm)  # Calculate SPM rate for baseline and reform.


6 participants