Add Phase 1 data examination results #17

Merged
merged 5 commits into from
Feb 27, 2024

martinholmer
Collaborator

This PR adds the examination/results.md document and provides a link to it in the high-level README.md document.

@martinholmer changed the title from "Add Phase 1 examination results" to "Add Phase 1 data examination results" on Feb 27, 2024
Collaborator

@donboyd5 left a comment

results.md:

  • Would it be possible to add the output-file variable names for the two current-law runs:
    • CY2023 Payroll Tax Liability ($ billion) (federal employee plus employer share) -- this looks like payrolltax
    • CY2023 Individual Income Tax Liability ($ billion) (federal individual income tax) -- it looks to me like this is iitax
  • Those are some big differences vs. CBO in current law liability (especially IIT), for both taxdata and pe! We will need to dig into that.
  • As a result of this, I have added the following variables to the ad hoc analysis: c05800, taxbc, othertaxes, iitax, payrolltax. (See this, at bottom of table: https://boyd-psl-adhoc.netlify.app/analysis.html#comparison-of-weighted-sums-for-selected-variables)
  • My ad hoc run shows the following, at 2023 levels, for baseline 2023 law (I think); amounts are in $ billion.
  • For iitax, I get $2,154.3 for taxdata, which about matches your $2,154.4. However, you get $2,012.9 for the pe phase 1 dataset but I get $1,540.4.
  • For payrolltax, I match your taxdata number and am somewhat closer for pe than I am with iitax: you have $1,696.7 but I have $1,630.1, which is still far too different to be a rounding difference.
  • This suggests that, for PE, we're doing something different, using different data, or comparing different results. I am using the Feb 20 version of the PE file (see top line here: https://boyd-psl-adhoc.netlify.app/prelims.html).
  • Maybe we can discuss when we talk tomorrow.
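
To put the gaps listed above in relative terms, here is a quick sketch that uses only the figures quoted in this comment (in $ billion). The helper name `pct_diff` is made up for illustration and is not part of any project code.

```python
# Figures quoted in the comment above, in $ billion:
# (value in results.md, value from the ad hoc analysis).
reported = {
    ("iitax", "taxdata"): (2154.4, 2154.3),
    ("iitax", "pe"): (2012.9, 1540.4),
    ("payrolltax", "pe"): (1696.7, 1630.1),
}


def pct_diff(reference: float, other: float) -> float:
    """Percent difference of `other` relative to `reference` (hypothetical helper)."""
    return 100.0 * (other - reference) / reference


for (variable, dataset), (results_md, ad_hoc) in reported.items():
    print(f"{variable:10s} {dataset:8s} {pct_diff(results_md, ad_hoc):+6.1f}%")
```

On these numbers the ad hoc pe figure is about 23% lower for iitax and about 4% lower for payrolltax, while the taxdata iitax figures agree to within a rounding error.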

@donboyd5
Collaborator

@martinholmer @nikhilwoodruff @MaxGhenis
Perhaps this is the cause of the difference for PE?

  • I stack 3 files: (1) pe, (2) taxdata as grown by you to 2023 and then run through tax-calculator, I believe, with all variables, and (3) taxdata as... with only the variables that are in pe.
  • This means that the variables in the pe and td same-variables-as-pe files that are not in the grown-td all-variables file are missing.
  • I set missing values to zero. I did this because I think/thought tax-calculator could not handle missing values.
  • I then run the stacked file through tax-calculator.

I am guessing that as a result I have some important input variables in my stacked file that I set to zero, and they are affecting tax-calculator results, giving results that are not what I intended.
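
A minimal sketch of the stacking step described above, in plain Python rather than the R used for the ad hoc analysis. All record contents and the variable name `extra_var` are placeholders. It only shows that zero-filling makes a variable a file never supplied indistinguishable from a genuinely reported zero; it does not claim to reproduce which variables actually drove the difference.

```python
# Toy records; every variable name and value here is a placeholder, not real data.
pe_records = [{"src": "pe", "wages": 50.0}]
td_all_records = [
    {"src": "td_all", "wages": 40.0, "extra_var": 7.0},
    {"src": "td_all", "wages": 30.0, "extra_var": 0.0},
]


def stack(*files):
    """Stack record lists, using None where a file lacks a column."""
    columns = []
    for records in files:
        for rec in records:
            for name in rec:
                if name not in columns:
                    columns.append(name)
    return [
        {name: rec.get(name) for name in columns}
        for records in files
        for rec in records
    ]


stacked = stack(pe_records, td_all_records)

# The step in question: replace every missing value with zero.
zero_filled = [
    {key: (0.0 if value is None else value) for key, value in rec.items()}
    for rec in stacked
]

print(stacked[0]["extra_var"])      # None: pe never supplied this variable
print(zero_filled[0]["extra_var"])  # 0.0: now looks like a reported zero
print(zero_filled[2]["extra_var"])  # 0.0: a genuine zero in the all-variables file
```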

It seems like the fix is for me to do tax calculations in two steps:

  • create a stacked file of pe and td same-variables-as-pe and run this through tax-calculator
  • run the td all-variables file through tax-calculator

Then stack the two resulting files. The stacked file will have missing values for the variables that are in td but not in pe, for both the pe records and the td same-variables-as-pe records. Leave them missing and calculate the comparisons on this file.
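
The two-step approach above can be sketched roughly as follows. This is an illustration under stated assumptions, not the revised R code: `run_calculator` is a hypothetical stand-in for a tax-calculator run (it assigns a made-up `iitax` equal to 10% of wages), and all names and numbers are placeholders.

```python
def run_calculator(records):
    """Hypothetical stand-in for a tax-calculator run: adds a made-up
    'iitax' equal to 10% of wages so the data flow can be shown."""
    return [{**rec, "iitax": 0.1 * rec["wages"]} for rec in records]


def weighted_sum(records, variable, source):
    """Weighted sum of `variable` for one source, skipping missing values."""
    return sum(
        rec["weight"] * rec[variable]
        for rec in records
        if rec["src"] == source and rec.get(variable) is not None
    )


# Placeholder inputs; names and numbers are illustrative only.
pe = [{"src": "pe", "weight": 2.0, "wages": 50.0}]
td_same = [{"src": "td_same", "weight": 1.0, "wages": 40.0}]
td_all = [{"src": "td_all", "weight": 1.0, "wages": 40.0, "extra_var": 7.0}]

# Step 1: pe and td same-variables-as-pe share columns, so they run together.
out_common = run_calculator(pe + td_same)
# Step 2: the td all-variables file runs on its own.
out_all = run_calculator(td_all)

# Stack the outputs, leaving absent variables missing (None) rather than zero.
all_outputs = out_common + out_all
columns = sorted({name for rec in all_outputs for name in rec})
combined = [{name: rec.get(name) for name in columns} for rec in all_outputs]

for source in ("pe", "td_same", "td_all"):
    print(source, weighted_sum(combined, "iitax", source))
```

Because zero-filling never happens before the calculation, each run sees only the variables its input file actually contains, and the missing values appear only in the comparison file.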

I think this should fix it. I had not thought through the implications of setting missing to zero in the stacked file. I'll try to do this now and will report back.

@donboyd5
Collaborator

That fixes it. The 2023 baseline-law results in the ad hoc analysis now match the results from @martinholmer. The updated ad hoc analysis is here. The revised R code is here.

The results generally are much closer now between pe and taxdata. @nikhilwoodruff and @MaxGhenis I'm sorry for any grief or head-scratching this caused you. Still plenty of questions to investigate, but not the massive differences my erroneous earlier ad hoc analysis gave.

@martinholmer
Collaborator Author

@donboyd5 asked in the discussion of PR #17:

Would it be possible to add the output-file variable names for the two current-law runs:

  • CY2023 Payroll Tax Liability ($ billion) (federal employee plus employer share) -- this looks like payrolltax
  • CY2023 Individual Income Tax Liability ($ billion) (federal individual income tax) -- it looks to me like this is iitax

The "output-file variable names" are on the second line of the td23.res-expect and pe23.res-expect files. I have added a mention of these two files in the data examination methods document.
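
For anyone who wants to pull that line out programmatically, a small sketch follows. The helper `second_line` is hypothetical, and the demo file contents are placeholders that say nothing about the real layout of the `.res-expect` files.

```python
import os
import tempfile
from itertools import islice


def second_line(path: str) -> str:
    """Return the second line of a text file, without its trailing newline."""
    with open(path, encoding="utf-8") as handle:
        line = next(islice(handle, 1, 2), "")
    return line.rstrip("\n")


# Demo on a throwaway file; its contents are placeholders, not the real format.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.res-expect")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\nthird line\n")
    demo_result = second_line(demo_path)

print(demo_result)
```

In a checkout of the repository, the same helper could be pointed at the path of td23.res-expect or pe23.res-expect.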

@martinholmer merged commit bf28cef into PSLmodels:master on Feb 27, 2024
2 checks passed
@martinholmer deleted the examination-results branch on February 27, 2024 18:44