Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Test to Check Reform Consistency for Itemized Deduction Surtax and Itemized Deduction Haircut #749

Closed
wants to merge 19 commits into from

Conversation

GoFroggyRun
Copy link
Contributor

@GoFroggyRun GoFroggyRun commented May 12, 2016

This PR continues to resolve a bug which was found in #490 by @MattHJensen. And the issue has been mostly identified in #748, which is that the BenefitSurtax function doesn't seem to be working properly. Specifically, the itemized deduction switch won't be turning on for the AMTI function, when the reform is implemented using _ID_BenefitSurtax_crt parameter, or BenefitSurtax function.

In particular, this PR does three things:

  1. Moved the BenefitSurtax function from functions.py to calculate.py, where I think the latter one seems to be a more appropriate place to put this function.
  2. Added a test in test_calculate.py that examines the consistency of implementing the same reform in two different ways: one specifies all seven ID_xxxx_HC items (to 1); while the other uses _ID_BenefitSurtax_crt parameter (set to 0) and BenefitSurtax function. The test has passed using the full 09puf on my local machine, and was then switched to using 91puf as input for the purpose of Travis CI testing.
  3. Identified and ("ugly") resolved the malfunction of itemized dedution parameters not being properly suppressed in AMTI function, when the reform is being implemented using BenefitSurtax function. Another switch has been added to the AMTI function to make sure everything is working properly. I'd appreciate any ''smarter" ideas to fix this though.

The TAXSIM validation continues to pass, and the changes do not affect any puf result or any mtr result at all. There's one slight change in reform_results.txt, which I believe is moving toward the right direction, given that the BenefitSurtax function is not properly working beforehand.

Any comments, concerns or remarks would be appreciated. Especially on how to fix this bug in a "smarter" way.
@martinholmer @MattHJensen @Amy-Xu

@martinholmer
Copy link
Collaborator

Sean, Do you see the error message that describes why the py.test suite failed?

@martinholmer
Copy link
Collaborator

For those following the new pull request #749, this is follow-on work from the late-2015 pull request #490, which found that two different ways of completely eliminating itemized deductions do not produce exactly the same results. Both pull requests include a new test test_ID_hc_vs_surtax, but #490 ran the test on the unrealistic 1991 data while #749 runs the test on the more realistic puf.csv data. The #749 test failure message is not very informative:

assert_array_almost_equal(calc1.records._combined, calc2.records._combined, decimal=2)
E       AssertionError: 
E       Arrays are not almost equal to 2 decimals
E       
E       (mismatch 0.190160772289%)
E        x: array([    73901.57,  29986897.74,  34929307.3 , ...,         0.  ,
E                      0.  ,         0.  ])
E        y: array([    73901.57,  29986897.74,  34929307.3 , ...,         0.  ,
E                      0.  ,         0.  ])

By adding the following statements to the test

dif = np.around((calc1.records._iitax - calc2.records._iitax), decimals=2)
absdif = np.absolute(dif)
print np.sum(absdif > 0.01)

we see that, out of the 219,814 filing units in the puf.csv file, only 418 have income tax liability differences greater than one cent in absolute value.

But the changes in the code so far (pull request #749 is still a work-in-progress) have the undesirable effect of creating massive failures in the validations tests. So, for example, using the d15 validation sample of 100,000 unit, we have gone from having no income tax liability differences of more than one cent to having 91,801 differences of more than one cent between Tax-Calculator (under pull request #749) and Internet TAXSIM, when both models are simulating current-law policy.

And the code changes so far also have the effect of creating large increases in aggregate income tax revenues under current-law policy: a rise of about 7.7 percent in 2013 and a rise of about 8.3 percent in 2022. Do we think Tax-Calculator is that far off in simulating income tax revenue under current law?

I don't understand why these changes in the validation test results and the requires_pufcsv unit tests are not being included in the commits in this pull request. The incomplete commits give a misleading impression of the implications of the proposed code changes.

@MattHJensen

@GoFroggyRun
Copy link
Contributor Author

GoFroggyRun commented May 12, 2016

@martinholmer:
This is caused by a typo in one of my commit. This PR will have no change toward any puf results at all. I'll fix the typo very soon.

@GoFroggyRun
Copy link
Contributor Author

@martinholmer:
Please refer to the most recent commit for all file changes, and I have also updated the main thread for this PR.

@GoFroggyRun GoFroggyRun changed the title [WIP] Added Test to Check Reform Consistency Added Test to Check Reform Consistency May 13, 2016
@martinholmer
Copy link
Collaborator

Sean (@GoFroggyRun),
This version of pull request #749 is much better; thank you for fixing this bug in Tax-Calculator.

I have been able to confirm that income tax liability is exact the same for each of the puf.csv filing units under these two reforms that eliminate itemized deductions.

I have just two small requests.

(1) Can you add to this conversation a description about which change actually fixed the bug? A number of the code changes in this pull request seem to be debugging aids, not changes in tax-calculation logic. What change (or changes) were essential in eliminating the bug?

(2) Can you remove from the functions.py file the line that says import copy because it is never used, and thus, raises a pylint warning about unused imports.

If you can do both of these things this afternoon, I'll be happy the merge this pull request before I go out-of-town for a week.

@MattHJensen

@codecov-io
Copy link

codecov-io commented May 13, 2016

Current coverage is 95.07%

Merging #749 into master will increase coverage by 7.86%

  1. 5 files (not in diff) in taxcalc were modified. more
    • Misses -97
    • Hits +106
  2. File taxcalc/records.py was modified. more
    • Misses -13
    • Partials 0
    • Hits +31
  3. File ...c/parameters_base.py (not in diff) was deleted. more
  4. File ...axcalc/parameters.py (not in diff) was created. more
@@             master       #749   diff @@
==========================================
  Files            12         12          
  Lines          1720       1746    +26   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           1500       1660   +160   
+ Misses          220         86   -134   
  Partials          0          0          

Powered by Codecov. Last updated by f384018...c836309

@GoFroggyRun
Copy link
Contributor Author

GoFroggyRun commented May 13, 2016

@martinholmer:

Sure. I have removed import copy from functions.py since this command is no longer used. Thanks for your reminder.

Regarding your other request:
The part of this PR that fixes the bug is actually within function AMTI, where the code line changes read:

if ID_BenefitSurtax_crt != 1:
     ID_Medical_HC = 1
     ID_Miscellaneous_HC = 1
     ID_StateLocalTax_HC = 1
     ID_RealEstate_HC = 1

My finding is that these ID_xxxx_HC parameters were not set to value 1 as desired in the AMTI function (which I'm not quite sure why this is the case), and thus another switch has been added to AMTI function to make sure all itemized items have indeed been suppressed.

@martinholmer
Copy link
Collaborator

Sean said:

The part of this PR that fixes the bug is actually within function AMTI, where the code line changes read:

if ID_BenefitSurtax_crt != 1:
     ID_Medical_HC = 1
     ID_Miscellaneous_HC = 1
     ID_StateLocalTax_HC = 1
     ID_RealEstate_HC = 1

My finding is that these ID_xxxx_HC parameters were not set to value 1 as desired in the AMTI function (which I'm not quite sure why this is the case), and thus another switch has been added to AMTI function to make sure all itemized items have indeed been suppressed.

What a minute. Now that you point out this AMTI function code change and provide an English explanation of why you added these lines, I have my doubts about the validity of this pull request #749.

Maybe I don't understand the nature of these two types of reforms, but what happens when a user specifies a reform with ID_BenefitSurtax_crt equal to 0.02 and each of the seven ID_*_HC parameters equal to 0.10? The code above will overwrite four of the seven HC parameters. Is that right? I think you need another test, using just one filing unit, to check that the code in this pull request produces the correct (as calculated by hand) answer with these values of the reform parameters.

@MattHJensen @feenberg

@GoFroggyRun
Copy link
Contributor Author

@martinholmer:

Yes, you are absolutely right. If ID_BenefitSurtax_crt and one or more ID_xxx_HC items are being considered simultaneously, the results won't be right. Let me see what's the best way to resolve this.

I don't think I'll be able to get this done by the end of today, so please enjoy your vacation. And I'll figure this out as soon as I can.

@martinholmer
Copy link
Collaborator

martinholmer commented May 13, 2016

Sean said:

I don't think I'll be able to get this done by the end of today, so please enjoy your vacation.

I would never expect you to be able to investigate this issue that fast. This is a complex problem that will probably take days to solve. Don't rush this. Let's get the logic correct in this pull request.

Just remember that Matt's original test was just a corner (or extreme) situation in which itemized deductions were completely eliminated in two different ways: (1) by full itemized-deduction haircuts, and (2) by turning the itemized deductions into credits but limiting the credit to be no more than zero percent of AGI. But most real-world reforms of this type will be either partial haircuts or limited credits (in lieu of deductions), or some combination of the two (as in my example).

I'll be back in my offices on Tuesday, May 24th.

@MattHJensen @feenberg

@feenberg
Copy link
Contributor

On Fri, 13 May 2016, Martin Holmer wrote:

Sean said:

  I don't think I'll be able to get this done by the end of today, so please
  enjoy your vacation.

I would never expect you to be able to investigate this issue that fast. This is a
complex problem that will probably take days to solve. Don't rush this. Let's get the
logic correct in this pull request.

Just remember that Matt's original test was just a corner (or extreme) situation in
which itemized deductions were completely eliminated in two different ways: (1) by
full itemized-deduction haircuts, and (2) by turning the itemized deductions into
credits but limiting the credit to be no more than zero percent of AGI. But most
real-world reforms of this type with be either partial haircuts or limited credits (in
lieu of deductions), or some combination of the two (as in my example).

Neither of these would affect the deductions shown as preferences on the
6251 for the AMT. I wonder if this is giving a misleading result. If there
is no tax benefit from a line on the Schedule A, are we adding that amount
on to C62100 on Form 6251?

dan

I'll be back in my offices on Tuesday, May 24th.

@MattHJensen @feenberg


You are receiving this because you were mentioned.
Reply to this email directly or view it on
GitHub[AHvQVVh13SyF3F-qzg9a9cygvoBLyK-bks5qBMvpgaJpZM4IdQFA.gif]

@GoFroggyRun
Copy link
Contributor Author

As described in #752, the problem unit in 09 puf that results in test failure is, although extremely rare, still possible and completely legit tax-logic-wise.

After a conversation with @MattHJensen, we think that, although there're some rare cases that the test added might not pass, it should still hold for almost all units in general, and thus we might want to give an exclusion to such special, and counter intuitive, case in #752.

What I have in mind is that, given we've already know the fundamental reason that give rise to #752, we could add an exclusion in the test suite, according to the findings in #752, so that all other units can still be tested for SurtaxBenifit. Does this sound a reasonable approach?

Comments, concerns or remarks would be appreciated.
@martinholmer @feenberg @MattHJensen

@martinholmer
Copy link
Collaborator

Sean @GoFroggyRun,
I wonder whether the logic changes in pull request #749 are sensible given the following change in results from the taxcalc/comparison/reform_results.txt file:

  limit the tax value of ID to 6% of AGI
 -Tax-Calculator,20.0,21.4,22.8,24.4
 +Tax-Calculator,1.8,1.9,2.3,2.8
  Budget Options,11,9,8,7

So, in the fourth year, the proposed change in logic causes Tax-Calculator to switch from an estimate of $24.4 billion to $2.8 billion. That is a reduction of about 88 percent. That is a very large reduction. Can you provide a clear English explanation for this enormous reduction?

Also, I wonder why the Budget Options estimate is declining over time while the Tax-Calculator estimates are rising over time. Can you provide an explanation of that difference?

@GoFroggyRun
Copy link
Contributor Author

GoFroggyRun commented May 25, 2016

@martinholmer:

Thanks for your comment regarding the taxcalc/comparison/reform_results.txt file.

The change in the reform_results.txt file was resulted from one of my previous code line change, which has now been reverted. So right now the file has been updated, and there's only rather slight increment to that particular item limit the tax value of ID to 6% of AGI.

A big change I have made to this added test suite is that, given we have known the reason that give rise to the "bug" in #752, I then excluded units who would get better-off when choose to have no deduction, and thus made this test applicable to other "normal" units.

And after this change being applied, this test will pass when fed by the full puf file. Also, after the latest commit, there's no change in functions.py file, meaning that no tax logic has been modified.

All tests passed, expect one slight change in reform_results.txt file, item limit the tax value of ID to 6% of AGI, which seems to be some rounding discrepancy to me.

@martinholmer Please take a look.

@martinholmer
Copy link
Collaborator

@GoFroggyRun,

Can you clarify in the comment just before the following lines what is going on here?

epsilon = 0.01
iitgap = calc1.records._iitax - calc2.records._iitax - epsilon
amtgap = calc1.records.c63200 - calc2.records.c63200
taxgap = calc1.records._taxbc - calc2.records._taxbc
exclude_REC = calc1.records.RECID[(iitgap < 0) & (taxgap + amtgap < epsilon)]

@GoFroggyRun
Copy link
Contributor Author

@martinholmer:
Thanks for your comment. I have added more comments before those code lines. Please review.

@martinholmer
Copy link
Collaborator

Sean, I don't see any absolute-value calculations in your code below:

epsilon = 0.01
# calculate the gap of individual income tax
iitgap = calc1.records._iitax - calc2.records._iitax - epsilon
# calculate the gap of alternative minimum tax
amtgap = calc1.records.c63200 - calc2.records.c63200
# calculate the gap of income tax
taxgap = calc1.records._taxbc - calc2.records._taxbc
# exclude records with larger, in terms of absolute value, amt gap than tax
# gap, which would result in lower iit liability
exclude_REC = calc1.records.RECID[(iitgap < 0) & (taxgap + amtgap < epsilon)]

So, I don't understand the phrase "in terms of absolute value". Can you clarify?

Also, how many records (number and percent) are excluded by the last statement above?

@GoFroggyRun
Copy link
Contributor Author

@martinholmer Sure. So, as described in #752, the only case where some unit would get better-off when choose to have zero deduction amount is that his/her AMT liability gap overwhelms his/her regular tax liability gap.

More specifically, in the two calculators defined in the test suite, calc1 refers to this special case where a unit has the chance to choose zero deduction amount. And if this is the case, calc1's regular tax liability will always be greater than calc2's, as there's no deduction for calc1. The only possibility calc1 could end up with a lower individual income tax liability than calc2 is that calc1's AMT liability needs to be even lower, when compared with the taxgap, than calc2's. That said, in this case,

taxgap = calc1.records._taxbc - calc2.records._taxbc > 0
amtgap = calc1.records.c63200 - calc2.records.c63200 < calc2.records._taxbc - calc1.records._taxbc < 0,

and thus taxgap + amtgap < epsilon is just comparing whether |amtgap| > |taxgap| if we ignore the relaxation term epsilon. The reason why I didn't adopt the latter comparison, |amtgap| > |taxgap|, is because I don't want to include the scenario when amtgap - taxgap > 0.

And regarding your second concern, the only unit being excluded is record with RECID == 1072110 in 09puf with weight 1020.73. Also, no unit will be excluded when using 91puf to perform this test.

@martinholmer
Copy link
Collaborator

Sean @GoFroggyRun, Thanks for the more complete explanation of the exclusion of records in the new test. @MattHJensen and Dan @feenberg, do either of you have any concerns about these changes in pull request #749? If not, should I merge this pull request into the master branch?

@GoFroggyRun GoFroggyRun changed the title Added Test to Check Reform Consistency Added Test to Check Reform Consistency for Itemized Deduction Surtax and Itemized Deduction Haircut May 27, 2016
@martinholmer
Copy link
Collaborator

Sean (@GoFroggyRun) said in the discussion of pull request #722:

#749 is intended to add an explicit test that examines the consistency of addressing ID surtax reform and ID haircut reform

So, if #749 just adds a test, why does it include so many changes to the tax-calculation logic? From what you are saying it seems like you need to eliminate many of the code changes in pull request #749.

@MattHJensen

@GoFroggyRun
Copy link
Contributor Author

GoFroggyRun commented May 27, 2016

@martinholmer:

Apart from adding a test suite, this PR also rearranges our standard deduction vs Itemized deduction optimization into a separate function called STD_vs_ITM, instead of in function calc_one_year. Moreover, BenefitSurtax has been moved from functions.py to calculate.py.

Although the above changes don't have any impact toward tax-logic, they might explain why you are seeing massive code-line changes.

@MattHJensen
Copy link
Contributor

@GoFroggyRun, could you explain why you think BenefitSurtax is better in calculate.py than functions.py? My initial reaction is that BenefitSurtax implements a feature of tax law (albeit one that is zeroed out under current law) so it is better in functions.py.

In the future it may be helpful to break these kinds of changes into separate PRs. If the core part of this PR, the test, were in a separate PR then it wouldn't be held up by secondary discussions about topics such as the placement of BenefitSurtax and it would be easier to read the diffs.

@GoFroggyRun
Copy link
Contributor Author

@MattHJensen:

could you explain why you think BenefitSurtax is better in calculate.py than functions.py? My initial reaction is that BenefitSurtax implements a feature of tax law (albeit one that is zeroed out under current law) so it is better in functions.py.

Sure. At the beginning, it seems to me that the BenefitSurtax function is more like a customized way of implementing surtax reform. Since, if I were not mistaken, in functions.py, we don't have any code line that reevaluates any policy parameters. From this perspective, I don't feel quite comfortable to put it in functions.py. But what you said also makes a lot of sense, and now I think putting it in functions.py might be more appropriate tax-law-wise.

In the future it may be helpful to break these kinds of changes into separate PRs. If the core part of this PR, the test, were in a separate PR then it wouldn't be held up by secondary discussions about topics such as the placement of BenefitSurtax and it would be easier to read the diffs.

Sure. Got you!

@MattHJensen
Copy link
Contributor

MattHJensen commented Jun 10, 2016

@GoFroggyRun said:

now I think putting it in functions.py might be more appropriate tax-law-wise.

Sounds good. Could you leave a comment here after you push the new commit?

@GoFroggyRun
Copy link
Contributor Author

@MattHJensen:
Just brought the BenefitSurtax function back into functions.py. Please take a look.

@martinholmer
Copy link
Collaborator

The need for pull request #749 has been eliminated by the merger of pull request #842.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants