# Rebuild Income Revenue w/ Tax Credits
Evan Sellers + Michael Yager

We previously calculated `$1667 Billion` in tax revenue, but the US Congress reported `$1609 Billion`. The two figures are close but do not match, so here we try to recalculate the value using the tax credit columns. \
[Revenues in Fiscal Year 2020](https://www.cbo.gov/system/files/2020-11/56746-MBR.pdf)

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
df = pd.read_csv("./data/tax_data_2020.csv")
df = df[df.zipcode != 0]

In [3]:
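def sumColumns(dataframe, columns):
    # Add up the totals of the given columns. abs() is applied to each
    # column total so a negative total still counts toward the sum.
    total = 0
    for col in columns:
        total += abs(dataframe[col].sum())
    return total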

In [4]:
def toMillion(amount):
    return round(amount / 1000000, 2)

def toBillion(amount):
    return round(amount / 1000000000, 2)

def toTrillion(amount):
    return round(amount / 1000000000000, 10)
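The cells further down multiply each column sum by 1000 before calling `toBillion`, which suggests the amounts are stored in thousands of dollars. A minimal sketch of that conversion, assuming that reading is correct; the raw sum used here is a made-up illustrative value, not one read from the data file:

```python
def toBillion(amount):
    # Same helper as above: dollars -> billions, rounded to 2 places.
    return round(amount / 1000000000, 2)

# Hypothetical column sum, assumed to be expressed in thousands of dollars.
raw_sum_in_thousands = 1_667_770_000

# Scale to dollars first, then convert to billions.
in_billions = toBillion(raw_sum_in_thousands * 1000)
print("$", in_billions, "Billion")
```

Forgetting the `* 1000` step would understate every figure by a factor of one thousand, so it is worth keeping the scaling next to the conversion call.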

## Types of Tax Credits
A tax credit is subtracted directly from the final tax bill, whereas a deduction only lowers your taxable income.
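To make the difference concrete, here is a small sketch comparing the same dollar amount applied as a deduction and as a credit. The flat 20% rate, the income, and both function names are assumptions chosen for illustration; real tax brackets are progressive and the credit here is treated as non-refundable.

```python
def tax_with_deduction(income, deduction, rate_pct):
    # A deduction lowers taxable income before the rate is applied.
    taxable = max(income - deduction, 0)
    return taxable * rate_pct // 100

def tax_with_credit(income, credit, rate_pct):
    # A credit is subtracted from the tax bill itself
    # (floored at 0, i.e. treated as non-refundable in this sketch).
    return max(income * rate_pct // 100 - credit, 0)

income, amount, rate_pct = 50_000, 1_000, 20  # hypothetical values

owed_with_deduction = tax_with_deduction(income, amount, rate_pct)
owed_with_credit = tax_with_credit(income, amount, rate_pct)
print(owed_with_deduction, owed_with_credit)
```

Under these assumptions the deduction is worth only `amount * rate`, while the credit is worth the full amount, which is why credits have a much larger effect on collected revenue.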

| Number Filed (Col) | Amount Filed (Col) | Form | Type of Return | Description |
| --- | --- | --- | --- | --- |
| `N07230` | `A07230` | Schedule 3 | Education | $2,500 per student, used toward course materials and tuition |
| `N07240` | `A07240` | Schedule 3 | Retirement Savings Contribution | 20% of contributions to a qualifying retirement plan |
| `N85770` | `A85770` | 8962 | Insurance Premium | Monthly insurance payment when enrolled in specific plans |
| `N07180` | `A07180` | Schedule 3 | Child/Dependent Care | Children under 13, or dependents who are unable to take care of themselves |
| `N07225` | `A07225` | 1040 | w/ Child/Dependent | Having any child under 18 or dependent |
| `N07300` | `A07300` | Schedule 3 | w/ Foreign Tax | Having any taxes imposed on you by a foreign country |
| `N11070` | `A11070` | 1040 | w/ Additional Child | Having 3 or more children |
| `N07260` | `A07260` | Schedule 3 | w/ Residential Energy | Energy efficiency improvements |
| `N09400` | `A09400` | Schedule 3 | w/ Self-Employment | Being self-employed |
| `N10960` | `A10960` | 1040 | w/ Refundable Education | Having to cover the cost of higher education |
| `N11450` | `A11450` | Schedule 3 | w/ Sick/Family Leave | Having to take leave from job for sickness or family reasons |
| `N10970` | `A10970` | 1040 | w/ Recovery Rebate | Anyone with income less than \$75K for 2021 |



In [6]:
TAX_CREDIT_AMT_1040  = [ "A07225", "A11070", "A10960", "A10970" ]
TAX_CREDIT_NUM_1040  = [ "N07225", "N11070", "N10960", "N10970" ]
TAX_CREDIT_AMT_SCH3  = [ "A07230", "A07240", "A07180", "A07300", "A07260", "A09400", "A11450", "A11560" ]
TAX_CREDIT_NUM_SCH3  = [ "N07230", "N07240", "N07180", "N07300", "N07260", "N09400", "N11450", "N11560" ]
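As a sanity check, the column lists can be compared against the table above. This is only a sketch: `TABLE_AMT_COLUMNS` is a hypothetical name, typed by hand from the "Amount Filed" column of the table.

```python
# Amount columns as they appear in the table above (hand-copied).
TABLE_AMT_COLUMNS = {
    "A07230", "A07240", "A85770", "A07180", "A07225", "A07300",
    "A11070", "A07260", "A09400", "A10960", "A11450", "A10970",
}

TAX_CREDIT_AMT_1040 = ["A07225", "A11070", "A10960", "A10970"]
TAX_CREDIT_AMT_SCH3 = ["A07230", "A07240", "A07180", "A07300",
                       "A07260", "A09400", "A11450", "A11560"]

listed = set(TAX_CREDIT_AMT_1040) | set(TAX_CREDIT_AMT_SCH3)

# Columns that are summed but have no row in the table.
undocumented = sorted(listed - TABLE_AMT_COLUMNS)
# Columns that have a table row but are not in either list.
unused = sorted(TABLE_AMT_COLUMNS - listed)

print("In lists, not in table:", undocumented)
print("In table, not in lists:", unused)
```

The check shows that `A11560` is summed without a table entry and that `A85770` (form 8962) is described in the table but not included in either list.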

## Rebuilding Tax Revenue
We previously calculated `$1667 Billion` in tax revenue, but the US Congress reported `$1609 Billion`. The figures are close, but differ by roughly `$58 Billion`.

In [19]:
# Calculated Tax Revenue
revenue = toBillion(df["A06500"].sum() * 1000)
print("$", revenue, "Billion")

$ 1667.77 Billion


In [20]:
# Difference in Calculated Tax Revenue and Reported Tax Revenue
print("$", round(toBillion(df["A06500"].sum() * 1000) - 1609, 2), "Billion")

$ 58.77 Billion


The difference is about `$58.77 Billion`, which can be accounted for here using the 1040 tax form columns. To do so, we compare the reported "total" credit numbers with our own totals.

### Number Breakdown

In [51]:
# Calculated Credit Amount (1040)
credit1040 = toBillion(sumColumns(df, TAX_CREDIT_AMT_1040) * 1000)
print("$", credit1040, "Billion")

$ 160.76 Billion


In [52]:
# Calculated Credit Amount (Schedule 3)
creditSch3 = toBillion(sumColumns(df, TAX_CREDIT_AMT_SCH3) * 1000)
print("$", creditSch3, "Billion")

$ 94.04 Billion


In [53]:
# Credit Refunded Next Year
creditNextYear = toBillion(df["A12000"].sum() * 1000)
print("$", creditNextYear, "Billion")

$ 79.05 Billion


In [54]:
# Reported Credit Amount (1040)
reportedCredit1040 = toBillion((df["A07100"].sum()) * 1000)
print("$", reportedCredit1040, "Billion")

$ 117.94 Billion


### Calculation
Add the reported credit back into the revenue number we got. This gets us the expected revenue before subtracting credits.

In [55]:
revenueBeforeCredits = revenue + reportedCredit1040

Sum our newly calculated credit values for both the 1040 and Schedule 3.

In [56]:
totalCredits = credit1040 + creditSch3

Finally, we have to remember that some credits roll over into the next year and are not paid out during this tax year. We subtract that amount from the total credits, which effectively adds it back into revenue.

In [57]:
totalCreditsThisYear = totalCredits - creditNextYear

We then subtract this credit amount from the revenue before credits.

In [60]:
revenueAdjusted = revenueBeforeCredits - totalCreditsThisYear

In [62]:
print("$", revenueAdjusted, "Billion")

$ 1609.96 Billion
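The whole calculation can be replayed from the rounded values printed in the cells above. This is a sketch that hard-codes those printed figures (in billions) instead of reading the data file; `rebuild_revenue` is a hypothetical helper name.

```python
def rebuild_revenue(revenue, reported_credit, credit_1040,
                    credit_sch3, credit_next_year):
    # Step 1: undo the reported credits to get revenue before credits.
    revenue_before_credits = revenue + reported_credit
    # Step 2: total recalculated credits, minus the part paid next year.
    credits_this_year = credit_1040 + credit_sch3 - credit_next_year
    # Step 3: apply the recalculated credits.
    return round(revenue_before_credits - credits_this_year, 2)

# Values printed by the cells above, in billions of dollars.
adjusted = rebuild_revenue(
    revenue=1667.77,
    reported_credit=117.94,
    credit_1040=160.76,
    credit_sch3=94.04,
    credit_next_year=79.05,
)
remaining_gap = round(adjusted - 1609, 2)

print("$", adjusted, "Billion")
print("Remaining gap vs reported:", remaining_gap, "Billion")
```

Written this way, the net adjustment is `117.94 - (160.76 + 94.04 - 79.05) = -57.81`, which closes all but `$0.96 Billion` of the original `$58.77 Billion` gap.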


This is within `$1 Billion` of the value we were looking for! I am unsure why the "income tax after credits amount" does not take this into consideration, but we are able to rebuild the value and land on essentially the number reported by Congress.

## Conclusion
We are unsure why, but the revenue after tax credits reported in the data is off by about `$58 Billion`. We were able to account for this difference by calculating the total tax credits granted and using that to recalculate the revenue after tax credits. In the end we calculated an income tax revenue of `$1609.96 Billion`, against the `$1609 Billion` reported by Congress.