Overall inequality and poverty summary statistics #1896

MaxGhenis · 2018-03-01T20:23:31Z

It would be helpful to have a function or functions to get summary statistics around inequality and poverty. These could either be part of the existing diagnostic_table function or made as new functions.

Potential metrics could include:

Gini coefficient based on aftertax_income, excluding negatives.
Share of after-tax income held by top 10/1/0.1%
Ratio of the top 20% to the bottom 20%, also based on aftertax_income.
Official poverty rate is defined here (this would take some more work).
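
To make the three inequality metrics above concrete, here is a minimal sketch of weighted versions built on a Lorenz curve. The function names (`lorenz_curve`, `weighted_gini`, `top_share`, `quintile_ratio`) are hypothetical and not part of Tax-Calculator; units at a percentile boundary are split by linear interpolation, which is one assumption among several possible.

```python
import numpy as np


def lorenz_curve(income, weights):
    """Return cumulative population shares p and income shares L, both starting at 0."""
    x = np.asarray(income, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    p = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    L = np.concatenate(([0.0], np.cumsum(x * w) / np.sum(x * w)))
    return p, L


def weighted_gini(income, weights, drop_negatives=True):
    """Gini = 1 - 2 * (area under the Lorenz curve), using the trapezoid rule."""
    x = np.asarray(income, dtype=float)
    w = np.asarray(weights, dtype=float)
    if drop_negatives:
        keep = x >= 0
        x, w = x[keep], w[keep]
    p, L = lorenz_curve(x, w)
    area = np.sum(np.diff(p) * (L[1:] + L[:-1]) / 2.0)
    return 1.0 - 2.0 * area


def top_share(income, weights, top_frac):
    """Share of total income held by the top `top_frac` of the weighted population."""
    p, L = lorenz_curve(income, weights)
    return 1.0 - np.interp(1.0 - top_frac, p, L)


def quintile_ratio(income, weights):
    """Income of the top 20% divided by income of the bottom 20%."""
    p, L = lorenz_curve(income, weights)
    return (1.0 - np.interp(0.8, p, L)) / np.interp(0.2, p, L)


# Toy data for illustration only.
income = np.array([1.0, 1.0, 1.0, 1.0, 6.0])
weights = np.ones(5)
print(weighted_gini(income, weights), top_share(income, weights, 0.2))
```

With Tax-Calculator output, `income` and `weights` would be the aftertax_income and s006 columns. Only `weighted_gini` drops negatives here; the share functions use all records as given.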

More challenging ones would include:

Supplemental Poverty Measure isn't totally possible today given it relies partially on "geographic differences in housing costs" but perhaps some averaging could produce an estimate useful for comparing reforms.
Extreme poverty rate as defined by the World Bank's $1.90/day in 2011 dollars. Though see Benchmarking poorest tax units against other reports C-TAM#61 and this Brookings paper on caveats, especially income vs. consumption. This might be more for comparing reforms than as an absolute truth.

I have some code for Gini and the other inequality ones, so can work on a PR there.

For the maintainers: do these belong in diagnostic_table, or should they be separate? Also should this issue be split into multiple for each metric?

cc @evtedeschi3 who's included the Gini coefficient and other inequality metrics in some past taxcalc analyses.


codykallen · 2018-03-01T21:45:47Z

@MaxGhenis, this is an interesting idea, but I don't think Tax-Calculator is a good place to implement poverty measures. The units included come from the population of tax filers, which naturally excludes many people with little or no income, those most relevant for poverty analyses. If you're calculating a Gini coefficient, this creates the additional complication that a filing unit is neither an individual nor a household, so you would need some mechanism to either connect married couples filing separately or to split married couples filing jointly, as well as considering how to count children and non-child dependents.

That being said, if CTAM can be combined with Tax-Calculator, then the additional information from CTAM on cash and non-cash benefits could be more useful to a poverty analysis.

Also, as I've noted several times on various PRs and issues, income at the bottom of the distribution is often mismeasured.

ernietedeschi · 2018-03-01T22:29:03Z

I largely agree with Cody. I will say I think Gini coefficients at the tax unit level are fine so long as the user is clear about what they are and their implications. Inequality research varies in using individuals, families, or households as the unit of choice for statistical analysis so a Gini calculation across tax units isn’t conceptually problematic, though it may not be apples-to-apples with all of the literature.

I’ve played around with poverty analysis in my tc output before and it’s loaded with issues, many of which Cody touched on. tc doesn’t include all of the income items Census does in their absolute poverty definition. And since the SPM is partially a relative measure (based on a percentile of consumption), that adds a whole other endogenous can of worms.

One thing I did to conceptually approximate a poverty analysis was to look at the number of filers below 50% of the median of after-tax income, adjusted for tax unit size (that is, after-tax income divided by the square root of total tax unit size). This is not the Census definition of poverty but it is a common alternative measure especially in international comparative contexts such as in OECD reports. It will give a back of the envelope estimate.
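
The relative measure described above (share of filers below 50% of the median of size-adjusted after-tax income) can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the median is a simple weighted median, and unit sizes below 1 are treated as 1 so the square root is well defined.

```python
import numpy as np


def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half of the total."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(v)
    v, w = v[order], w[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]


def relative_poverty_rate(aftertax_income, unit_size, weights, frac=0.5):
    """Weighted share of units whose equivalized income is below frac * median."""
    income = np.asarray(aftertax_income, dtype=float)
    size = np.maximum(np.asarray(unit_size, dtype=float), 1.0)  # assumption: floor at 1
    w = np.asarray(weights, dtype=float)
    equiv = income / np.sqrt(size)  # square-root equivalence scale
    line = frac * weighted_median(equiv, w)
    return w[equiv < line].sum() / w.sum()


# Toy data: four tax units with equal weights.
rate = relative_poverty_rate([8000, 40000, 40000, 90000], [1, 4, 1, 9], [1, 1, 1, 1])
print(rate)
```

Because the rate is weighted by tax units, it counts filers rather than people; multiplying the weights by unit size would give a person-based version.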


feenberg · 2018-03-01T22:45:45Z

On Thu, 1 Mar 2018, evtedeschi3 wrote: One thing I did to conceptually approximate a poverty analysis was to look at the number of filers below 50% of the median of after-tax income, adjusted for tax unit size (that is, after-tax income divided by the square root of total tax unit size). This is not the Census definition of poverty but it is a common alternative measure especially in international comparative contexts such as in OECD reports. It will give a back of the envelope estimate.

It is common, but it is also tendentious in a way that the AEI would probably not like to be associated with. It is not a measure of poverty, but a measure of inequality that no amount of proportional growth can improve.

dan
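
The point about proportional growth can be checked with a toy example: doubling every income leaves the share below half the median unchanged, while the share below a fixed line falls. The incomes and the 12,500 line are made-up illustration values, and the calculation is unweighted for brevity.

```python
import numpy as np


def share_below_half_median(income):
    """Unweighted share of incomes below 50% of the median."""
    income = np.asarray(income, dtype=float)
    return np.mean(income < 0.5 * np.median(income))


incomes = np.array([5_000, 12_000, 30_000, 45_000, 80_000])
fixed_line = 12_500  # arbitrary absolute threshold, for illustration only

relative_before = share_below_half_median(incomes)
relative_after = share_below_half_median(2 * incomes)  # every income doubles

absolute_before = np.mean(incomes < fixed_line)
absolute_after = np.mean(2 * incomes < fixed_line)

print(relative_before, relative_after)  # unchanged by proportional growth
print(absolute_before, absolute_after)  # falls when incomes grow
```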


ernietedeschi · 2018-03-01T23:16:59Z

My comment was meant to assist Max rather than being a suggestion for a Tax-Calculator feature. But also, as I’m sure you’re aware, one could easily argue absolute measures of poverty are tendentious too. It’s a complex debate, which punctuates Cody’s original point that this is probably beyond the purview of Tax-Calculator.


MaxGhenis · 2018-03-01T23:24:11Z

if CTAM can be combined with Tax-Calculator, then the additional information from CTAM on cash and non-cash benefits could be more useful to a poverty analysis.

@codykallen Isn't that already the case? Or are there missing benefits that would be important?

Gini coefficients at the tax unit level are fine so long as the user is clear about what they are and their implications

@evtedeschi3 I agree - tc won't match other sources, but comparing Gini across reforms in my own analysis has yielded useful and directionally expected results. Incomes could also be divided by XTOT for crude individual analysis.
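
A rough sketch of the XTOT adjustment mentioned above: divide each unit's income by XTOT and weight by s006 times XTOT, so each person in the unit is counted once. The helper name `to_person_level` is hypothetical, and dropping units with XTOT of zero is an assumption made here to avoid dividing by zero.

```python
import pandas as pd


def to_person_level(df):
    """Return per-capita income and person weights for units with XTOT > 0."""
    units = df[df["XTOT"] > 0]
    return pd.DataFrame({
        "income_pc": units["aftertax_income"] / units["XTOT"],
        "person_weight": units["s006"] * units["XTOT"],
    })


# Toy tax units for illustration; the third has XTOT == 0 and is dropped.
df = pd.DataFrame({
    "aftertax_income": [30000.0, 60000.0, 10000.0],
    "XTOT": [1, 3, 0],
    "s006": [100.0, 200.0, 50.0],
})
people = to_person_level(df)
print(people)
```

The two resulting columns can then be passed to any weighted statistic in place of aftertax_income and s006.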

tc doesn’t include all of the income items Census does in their absolute poverty definition.

Which are missing? Do we know how much they're likely to affect results? Are there plans to impute them? Do other tax analysis groups use incomplete data to estimate the impact of reforms on poverty?

since the SPM is partially a relative measure (based on a percentile of consumption), that adds a whole other endogenous can of worms.

Ah I'd missed that. Seems like a deal-breaker then, and also raises concerns @feenberg described (which are valid IMO - I prefer measuring poverty in absolute terms, and then inequality separately).

These issues (tax units vs household, lack of all income sources, mismeasurement of lowest earners) are good to consider, but it seems like they also apply to any estimation of the bottom decile or so, which tc currently supports. The question would be whether they're too distorted to report at all, or can be included with caveats. And whether certain users might end up reporting them with caveats outside of tc anyway...

MaxGhenis · 2018-03-01T23:28:07Z

It’s a complex debate, which punctuates Cody’s original point that this is probably beyond the purview of Tax-Calculator.

Does AEI have a preferred poverty measure?

ernietedeschi · 2018-03-01T23:35:49Z

@MaxGhenis wrote:

Which are missing? Do we know how much they're likely to affect results? Are there plans to impute them? Do other tax analysis groups use incomplete data to estimate the impact of reforms on poverty?

Bear in mind that the official Census money income definition used in calculating the poverty rate doesn't take taxes into account at all. It is generally a pre-tax, post-transfer measure of income. You can of course modify this to account for taxes, and many researchers do, but then it's not "official". See e.g. https://www.census.gov/topics/income-poverty/poverty/about.html

MaxGhenis · 2018-03-02T00:03:37Z

the official Census money income definition used in calculating the poverty rate doesn't take taxes into account

Doh, I misread another site on this. Never mind then.

So poverty sounds hard, except for WB extreme poverty which is affected by mismeasurement at the very bottom. Some version of SPM could possibly be done by anchoring against a particular year's thresholds, like Wimer et al (2013), but the geographic part would still require some sort of national averaging.

So maybe this could just consider inequality metrics to start?

ernietedeschi · 2018-03-02T12:01:12Z

Here's another thing you could mull over -- but this will take some playing around on your part as I'm thinking out loud here.

The cps.csv file now includes all the variables you need to link directly to the CPS ASEC: hh_seq, ffpos, and pulineno (as well as the survey year, which will be 1 + the tax year in the raw unprocessed cps.csv file).

The CPS ASEC has each family's official and SPM poverty status, as well as the relevant income and threshold measures for each.

In principle, then, you could merge each measure in and then make some assumptions about how your policy delta in tc affects them.

So for example SPM: you will have each family's SPM income and poverty threshold. What I'm basically thinking is you merge these variables in, then add in the change in after-tax income from your policy and recalculate poverty based on that.

Some big caveats here:

Tax units aren't families, and in fact many CPS families are broken up into multiple tax units for the cps.csv file. You would either have to ignore split families or find a way to reconstruct / aggregate them, which now that we have all the linking variables shouldn't be too difficult in principle.
Inflation. The cps.csv file is taken from three separate annual samples, so the poverty variables you import will not be inflation adjusted for whichever year you simulate policy in tc. You'll need to use the tc growth factors to either adjust the variables forward or scale your after-tax income delta back to base (and remember that as of now, CPI-U is still used for poverty purposes, not chained CPI). You also need to be aware that while the official poverty thresholds involve a simple CPI adjustment year to year, the SPM thresholds are more complex, but I don't know that there's a way to get around the fact that for SPM you'll just have to make an assumption. Maybe take a basket of SPM thresholds and compare how they've grown over the last couple years to simple CPI-U inflation.
The 2014 CPS ASEC, which yields the 2013 poverty variables, was, if I recall correctly, a transition year for the CPS, with part of the sample given old questions and part given new ones. I'm not sure if SPM poverty is available for that entire year's sample; if not, you'd have to come up with a way of excluding the omitted and reweighting the rest.
The tax bill has the ancillary effect of leading to lower health insurance coverage among low income people, which is very much germane to poverty estimates particularly SPM. Since SPM is simply measuring total resources available to a family, it's not relevant for that measure whether the fall in health insurance coverage is by choice or not.
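
The merge-and-recalculate idea described above can be sketched in pandas. This is thinking out loud in code rather than a tested method: `hh_seq` and `ffpos` are the linking variables named in this comment, while the column names `resources`, `threshold`, and `delta_aftertax`, the function name, and the `deflator` argument (a factor scaling simulation-year dollars back to the survey year) are all hypothetical.

```python
import pandas as pd

KEYS = ["hh_seq", "ffpos"]  # family-level linking variables named in the thread


def recompute_poverty(families, tax_units, deflator=1.0):
    """Add each family's summed tax-unit income change to its resources.

    families:  one row per family with KEYS, 'resources', 'threshold'
    tax_units: one row per tax unit with KEYS and 'delta_aftertax'
    deflator:  converts simulation-year dollars to survey-year dollars
    """
    # A family may be split into several tax units, so sum their deltas first.
    delta = tax_units.groupby(KEYS, as_index=False)["delta_aftertax"].sum()
    out = families.merge(delta, on=KEYS, how="left")
    # Families with no matching tax unit get a zero change.
    out["delta_aftertax"] = out["delta_aftertax"].fillna(0.0) * deflator
    out["new_resources"] = out["resources"] + out["delta_aftertax"]
    out["poor_base"] = out["resources"] < out["threshold"]
    out["poor_reform"] = out["new_resources"] < out["threshold"]
    return out


# Toy data: family 1 is split into two tax units, family 3 has none.
families = pd.DataFrame({
    "hh_seq": [1, 2, 3], "ffpos": [1, 1, 1],
    "resources": [20000.0, 10000.0, 50000.0],
    "threshold": [25000.0, 25000.0, 25000.0],
})
tax_units = pd.DataFrame({
    "hh_seq": [1, 1, 2], "ffpos": [1, 1, 1],
    "delta_aftertax": [3000.0, 4000.0, 2000.0],
})
result = recompute_poverty(families, tax_units)
print(result[["hh_seq", "poor_base", "poor_reform"]])
```

Summing deltas by family before merging is one way to handle the split-family caveat; the inflation and threshold caveats are reduced here to a single scaling factor, which is a simplification.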

martinholmer · 2018-04-06T17:08:25Z

@MaxGhenis said on March 1, 2018:

It would be helpful to have a function or functions to get summary statistics around inequality and poverty.

In the first few days of March, there was an informed discussion about all the pitfalls that would have to be avoided and all the subjective judgements that would have to be made in doing this.

There has been no further discussion over the past four or five weeks. Given that there is no consensus about how to do this in the Tax-Calculator library, it seems as if calculating inequality and poverty statistics is best left up to Tax-Calculator users with an interest in such statistics. That approach allows different users to make their own judgements about how to compute the statistics.

MattHJensen · 2018-04-06T17:25:02Z

Sorry to come to this late, but I just saw something that's best for me to address:

Does AEI have a preferred poverty measure?

No, AEI does not have institutional positions. More importantly, if AEI did have an institutional position, it wouldn't be relevant to this project because the project is governed by its core maintainers, not by AEI.

As for the substance of the issue itself, I agree with @martinholmer's conclusion that:

Given that there is no consensus about how to do this in the Tax-Calculator library, it seems as if calculating inequality and poverty statistics is best left up to Tax-Calculator users with an interest in such statistics. That approach allows different users to make their own judgements about how to compute the statistics.

cc @MaxGhenis @evtedeschi3 @feenberg @martinholmer

MaxGhenis · 2019-03-17T15:10:12Z

Just saw this PSL meetup description, which looks relevant here. @evtedeschi3 are you following the approach you described in #1896 (comment)?

In a recent working paper, Mr. Tedeschi analyzes the poverty effects of the earned basic income tax credit, a proposed expansion of the current earned income tax credit. His novel approach to estimating poverty rates uses the open-source Tax-Calculator model and the Annual Social and Economic Supplement to the Current Population Survey.

If this is the Supplemental Poverty Measure, I think this is increasingly valuable for taxcalc. For example, in January, Vox reported (https://www.vox.com/future-perfect/2019/1/30/18183769/democrat-poverty-plans-2020-presidential-kamala-harris-booker-gillibrand) on research from Columbia comparing the SPM effects of 5 plans from 2020 contenders (it was their front page cover story for at least a day).

Also FYI, I've added a gini function (https://github.com/MaxGhenis/taxcalc_helpers/blob/master/taxcalc_helpers/utils.py#L5) to taxcalc_helpers, which includes weights. Here's an example notebook, and the most common usage with taxcalc is:

import taxcalc_helpers as tch
# calc is a taxcalc Calculator.
df = calc.dataframe(['aftertax_income', 's006'])
tch.gini(df.aftertax_income, df.s006)
# Or, to zero out negatives:
tch.gini(df.aftertax_income, df.s006, negatives='zero')

ernietedeschi · 2019-03-19T18:39:25Z

The approach is in broad strokes consistent with this, though I ended up creating a synthetic 2017 CPS-based data file to run the analysis on since I wanted the latest possible SPM estimates and they are difficult to project out. Will discuss further in my presentation.


MaxGhenis · 2019-03-25T18:44:41Z

Thanks @evtedeschi3 for the great presentation today on calculating SPM with ASEC and taxcalc, and your paper applying this approach to an EBITC reform (could you share your slides?). You showed code at https://github.com/evtedeschi3/tcpoverty which splits the ASEC into tax units, then after running taxcalc sums up the change in after-tax income to the SPM unit for calculating the SPM rate.

Adding these capabilities natively to taxcalc would be useful. The most involved piece would be translating your tcpov2a_make_taxsim27.R program into a Python function and adding documentation, and also the more straightforward piece of re-aggregating to SPM units and calculating SPM features.

This tax unit script is a lot simpler than the taxdata SAS scripts so I'm guessing it misses some things, but I also spoke with a Columbia poverty researcher who was doing something similar with taxcalc/ASEC, so I think it's worthwhile to have the flexibility of inputting your own ASEC. I'll need this for my own research, so if taxcalc/taxdata maintainers would prefer I can add it to my taxcalc_helpers package instead.

I'll be trying to run @evtedeschi3's process in Python in the next few days and report back how it goes.

ernietedeschi · 2019-03-25T20:29:02Z

Very kind of you @MaxGhenis. I've added the slides to that repository: psl_presentation_v2.pdf

Taking a step back, I think a useful first question would be "What is the goal of 'integration' here?" These scripts are relying on data outside of what Tax-Calculator currently makes available, namely the 2018 CPS ASEC.

So there are many different "levels" of changes that would automate what I did to different extents.

Off the top of my head, the most simple approach would be to modify the CLI so that a single run will produce a dump with tax changes (rather than having to run a base and then a reform sim). And then at the same time, streamline/automate the process for taking a more recent CPS ASEC than what's used in the cps.csv file and converting it into a data file readable by Tax-Calculator.

That would allow a user to more quickly create a simulation off of a recent ASEC that she could then manually re-merge back into the ASEC and tabulate in the manner I did.

The more complicated approach would be to fully integrate poverty output into Tax-Calculator. There might be a way to do this that just involves including the SPM unit, SPM weight, SPM threshold, and SPM resource variables into the cps.csv and then automating how they're tabulated after a reform. But it requires some thought because the SPM poverty rate is measured as a percent of all people, not a percent of families or tax units. Also, if I recall correctly, the current cps.csv only draws on data from the 2013-15 ASECs; we have three newer years that researchers will likely want to be able to access for poverty estimates. And as I mentioned in the presentation, the assumptions become even more complex once we start talking about projecting multi-year SPM poverty estimates versus single year historical counterfactual poverty estimates.
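
To illustrate why the unit of tabulation matters, here is a small sketch comparing a rate computed over units with one computed over people. It assumes each person in a unit carries the unit's weight, which is a simplification of how the ASEC SPM weights work; the function name is hypothetical.

```python
import numpy as np


def poverty_rate(poor, unit_weight, persons=None):
    """Weighted share in poverty, over units or (if `persons` is given) over people."""
    poor = np.asarray(poor, dtype=bool)
    w = np.asarray(unit_weight, dtype=float)
    if persons is not None:
        w = w * np.asarray(persons, dtype=float)  # count each person in the unit
    return w[poor].sum() / w.sum()


# Toy data: one large poor unit and two smaller non-poor units.
poor = [True, False, False]
weights = [100.0, 100.0, 100.0]
persons = [5, 1, 2]

unit_rate = poverty_rate(poor, weights)             # share of units
person_rate = poverty_rate(poor, weights, persons)  # share of people
print(unit_rate, person_rate)
```

In this toy case a third of units are poor but well over half of people are, so tabulating by tax unit or family could differ noticeably from the published person-based SPM rate.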

MaxGhenis · 2019-03-26T17:00:57Z

streamline/automate the process for taking a more recent CPS ASEC than what's used in the cps.csv file and converting it into a data file readable by Tax-Calculator.

I think this is the key part. Modifying the CLI to produce tax changes sounds worthwhile regardless of whether one is creating poverty statistics or other tax analysis.

Something like this is what I'd like to be able to do from the Python API (could translate to CLI):

import pandas as pd
import taxcalc as tc

# create_asec_tax_units, compare, agg_spm, and spm_rate are proposed functions, not existing ones.
asec = pd.read_csv('asec.csv')
recs = tc.create_asec_tax_units(asec)
base = tc.Calculator(policy=tc.Policy(), records=recs)
# Same for reforms, plus advance_to_year(), calc_all(), etc.
# Get change in disposable income per tax unit
comp = tc.compare(base, reform)
# Aggregate change in disposable income to the SPM unit using the original ASEC
# Also adds a column for `new_spm_resources`
comp_spm = tc.agg_spm(comp, asec)
tc.spm_rate(comp_spm)  # Calculate SPM rate for baseline and reform.

martinholmer added enhancement request labels Mar 2, 2018

martinholmer closed this as completed Apr 6, 2018

MaxGhenis mentioned this issue Feb 7, 2019

FYI: Comparison to Stanford's California Poverty Measure PSLmodels/C-TAM#78

Open

MaxGhenis mentioned this issue Feb 25, 2019

Supplemental Poverty Measure PSLmodels/microdf#8

Closed
