# Auto-Notebook Documentation

This notebook documents how to use the auto_notebook tool to generate [Tax-Calculator](https://github.com/open-source-economics/Tax-Calculator.git) reports.

The auto-notebook tool leverages the PyLaTeX package to generate single-page reports of policy reforms simulated using Tax-Calculator. To properly execute auto-notebook it is necessary to have the [PyLaTeX](https://jeltef.github.io/PyLaTeX/latest/) library installed and have two Python files in your operating directory: [`auto_func.py`](https://github.com/econ02/auto_taxcalc_report/blob/master/auto_func.py) and [`auto_calc.py`](https://github.com/econ02/auto_taxcalc_report/blob/master/auto_calc.py). These Python files contain the necessary commands to produce the auto-report.

Additional auto-notebook examples can be found on [GitHub](https://github.com/econ02/auto_taxcalc_report/tree/master/examples).

The only packages that need to be imported directly in the notebook are `taxcalc` and `collections`.

In [1]:
from taxcalc import *
from collections import OrderedDict

All additional packages are imported through the auxiliary Python files, `auto_func.py` and `auto_calc.py`. The following packages are included:

In [3]:
import re 
import os
import sys
import math
import copy
import locale
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from taxcalc import *
from decimal import *
from pylatex.utils import italic, NoEscape
from pylatex.base_classes import Container, Command
from pylatex import Document, PageStyle, Head, Foot, MiniPage, Section, \
                    Command, StandAloneGraphic, MultiColumn, MultiRow, Tabu, \
                    LongTabu, LargeText, MediumText, LineBreak, NewPage, \
                    Tabularx, Tabular, TextColor, simple_page_number, Figure, \
                    Subsection, Math, TikZ, Axis, Plot, Figure, Matrix, Itemize

To begin auto_notebook, you must first define a start year for each simulation and the path of the data: `records_url`.

In [4]:
"""
Dashboard Settings
"""

start_year = 2017
records_url = 'puf.csv'

The next step in auto_notebook is to define the behavioral settings for the simulations. There are two types of behavioral settings: `tyc_beh` and `beh_`. The `tyc_beh` setting applies only to the dynamic ten-year cost estimates, not to the remainder of the simulation. The `beh_` settings allow the calculator to run under a number of different behavioral assumptions. Each desired setting should be defined and included in `beh_dict`, an ordered dictionary that allows the auto_notebook functions to apply each behavioral assumption. Note that the reference name in the ordered dictionary is the term used to refer to that behavior in any summary tables that are produced. It is not necessary to define multiple behaviors or include any dynamics.

In [10]:
"""
Behavioral Settings
"""
tyc_beh = {start_year: {'_BE_inc': [0.0], 
                        '_BE_sub': [0.4], 
                        '_BE_cg': [0.0]}}

# The following name will be used as the comparative baseline case in the calculation of formulas and automated notes
beh_0 = {start_year: {'_BE_inc': [0.0],
                      '_BE_sub': [0.4],
                      '_BE_cg': [0.0]}}
beh_1 = {start_year: {'_BE_inc': [0.0],
                      '_BE_sub': [0.4],
                      '_BE_cg': [0.0]}}

# behavior dict with static baseline
# beh_dict = OrderedDict([(base_beh, beh_0), ('Dynamic', beh_1)])

# behavior dict w/o static baseline
beh_dict = OrderedDict([('Dynamic', beh_1)])
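
The role of the labels and their ordering can be seen in a small sketch. The `'Static'` entry and its all-zero elasticities are assumptions made for illustration (they are not defined in this notebook), and nothing here calls Tax-Calculator; the sketch only shows that an ordered dictionary hands back its labels in the order they were inserted.

```python
from collections import OrderedDict

start_year = 2017

# Illustrative behavioral assumptions; 'Static' is a hypothetical label
# with every elasticity set to zero.
static = {start_year: {'_BE_inc': [0.0], '_BE_sub': [0.0], '_BE_cg': [0.0]}}
dynamic = {start_year: {'_BE_inc': [0.0], '_BE_sub': [0.4], '_BE_cg': [0.0]}}

beh_dict = OrderedDict([('Static', static), ('Dynamic', dynamic)])

# Labels come back in insertion order, so any table indexed by
# behavior would list them in this order.
labels = list(beh_dict.keys())
sub_elasticities = [beh[start_year]['_BE_sub'][0] for beh in beh_dict.values()]

print(labels)
print(sub_elasticities)
```

Whatever string is used as the key (here `'Static'` or `'Dynamic'`) is the name that would have to be used later when looking up results for that behavior.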

Policy settings are also defined and included in an ordered dictionary called `ref_dict`. Reforms can be labeled using variables (as done below) or directly in the ordered dictionary. These labels appear in any generated tables and must be used when referencing a reform in the calculation of any parameters in the auto_notebook. Functions will loop through each reform and apply each behavioral setting in `beh_dict`.

In [5]:
"""
Policy Settings
"""

current_law_title = 'Current Law'
current_law = {start_year: {}}

ref_1_title = 'Repeal ID & Double SD'
ref_1 = {start_year: {'_ID_Medical_hc' : [1.0], 
                      '_ID_StateLocalTax_hc' : [1.0], 
                      '_ID_RealEstate_hc' : [1.0], 
                      '_ID_InterestPaid_hc' : [1.0],
                      '_ID_Charity_hc' : [1.0],
                      '_ID_Casualty_hc' : [1.0],
                      '_STD': [[6350 * 2, 12700 * 2, 6350 * 2, 9350 * 2, 12700 * 2]]}}

ref_2_title = 'Repeal ID, Double SD, & Repeal PE'
ref_2 = {start_year: {'_ID_Medical_hc' : [1.0], 
                      '_ID_StateLocalTax_hc' : [1.0], 
                      '_ID_RealEstate_hc' : [1.0], 
                      '_ID_InterestPaid_hc' : [1.0],
                      '_ID_Charity_hc' : [1.0],
                      '_ID_Casualty_hc' : [1.0],
                      '_STD': [[6350 * 2, 12700 * 2, 6350 * 2, 9350 * 2, 12700 * 2]],
                      '_II_em': [0]}}

ref_dict = OrderedDict([(current_law_title, current_law),
                        (ref_1_title, ref_1), 
                        (ref_2_title, ref_2)])
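
Since `ref_2` differs from `ref_1` only by the `'_II_em'` entry, one possible way to avoid repeating the shared provisions is to build the second reform from a deep copy of the first. This is an optional sketch rather than how the notebook is written; `std_2017`, `haircut_params`, and `provisions` are names introduced here for illustration.

```python
import copy

start_year = 2017

# 2017 standard deduction amounts, as used in the cell above.
std_2017 = [6350, 12700, 6350, 9350, 12700]

# Itemized-deduction haircut parameters set to 1.0 in both reforms.
haircut_params = ['_ID_Medical_hc', '_ID_StateLocalTax_hc',
                  '_ID_RealEstate_hc', '_ID_InterestPaid_hc',
                  '_ID_Charity_hc', '_ID_Casualty_hc']

provisions = {name: [1.0] for name in haircut_params}
provisions['_STD'] = [[2 * amount for amount in std_2017]]
ref_1 = {start_year: provisions}

# Deep copy so that changing ref_2 cannot alter ref_1.
ref_2 = copy.deepcopy(ref_1)
ref_2[start_year]['_II_em'] = [0]

print(sorted(ref_2[start_year].keys()))
```

A shallow copy would share the inner dictionary between the two reforms, so adding `'_II_em'` to one would silently add it to the other; the deep copy avoids that.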

The next section defines the settings for the summary table to be included in the report. Two settings control the table. The `column_label` is the title of the leftmost reform column. The `column_list` is the list of variables that you wish to include in the table. The table function works by looking up each variable and returning a token, from which the value can be calculated.

In [7]:
"""
The first item in 'column_list' should be either 'Reform',
if the table is to be indexed by policy reforms, or 'Behavior',
if the table is to be indexed by behavioral assumptions
"""

column_label = 'Policy'
column_list = ['Reform',
               'Taxpayers Receiving Tax Cut (millions)',
               'Itemizers (millions)',
               'Taxpayers Facing Lower Marginal Tax Rate (millions)',
               'Taxpayers Paying Zero or Less Income Tax (millions)',
               '10-Year Revenue Change, Dynamic (billions)']

Each variable and token is stored in the `table_vars_dict` in the `auto_func.py` file. The corresponding tokens are commands stored as strings that are executed in the table function. The current version of `table_vars_dict` contains the following variables (note that variables can be added to this dictionary as desired).

In [8]:
table_vars_dict = dict([("10-Year Revenue Change, Dynamic (billions)",
                         "currency_fmt(ten_year_cost_PE(calc_cl, calc_dict[key1][key2], beh_dict[key2]))"),
                        ("10-Year Revenue Change, Static (billions)",
                         "currency_fmt(ten_year_cost(calc_cl, calc_dict[key1][key2]))"),
                        ("Deduction Cap, Joint Filers",
                         "double(key1)"),
                        ("Itemizers (millions)",
                         "'{:,.1f}'.format(num_ided(calc_dict[key1][key2]) / 10.0**6)"),
                        ("Taxpayers Facing Lower Marginal Tax Rate (millions)",
                         "'{:,.1f}'.format(lwrMTR_wages(calc_cl, calc_dict[key1][key2]) / 10.0**6)"),
                        ("Taxpayers Paying Zero or Less Income Tax (millions)",
                         "'{:,.1f}'.format(no_inc_tax(calc_dict[key1][key2]) / 10.0**6)"),
                        ("Taxpayers Receiving Tax Cut (millions)",
                         "'{:,.1f}'.format(num_taxcut(calc_cl, calc_dict[key1][key2]) / 10.0**6)"),
                        ("Taxpayers Receiving Tax Hike (millions)",
                         "'{:,.1f}'.format(num_taxhike(calc_cl, calc_dict[key1][key2]) / 10.0**6)"),
                        ("Taxpayers Receiving CTC (millions)",
                         "'{:,.1f}'.format(num_ctc(calc_dict[key1][key2]) / 10.0**6)"),
                        ("Total Charitable Contributions (billions)",
                         "currency_fmt(num_charity(calc_dict[key1][key2]))"),
                        ("Wght. Ave. MTR on Charitable Contributions",
                         "'{:.2f}'.format(100 * charity_wmtr(calc_dict[key1][key2]))"),
                        ("Reform",
                         "key1"),
                        ("Behavior",
                         "key2")])

In this example we include *'10-Year Revenue Change, Dynamic (billions)'* as the last variable in `column_list`. The table function will look up the variable in `table_vars_dict` and return the token string, `"currency_fmt(ten_year_cost_PE(calc_cl, calc_dict[key1][key2], beh_dict[key2]))"`. When executed, this token calculates the dynamic ten-year cost for each calculator defined in `calc_dict`. The number of calculators in `calc_dict` is equal to the number of behaviors times the number of policy reforms. Note the last two variables in `table_vars_dict`, `Reform` and `Behavior`. `Reform` places the policy reform label in the leftmost column, while `Behavior` places the behavior label in the leftmost column.
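
The lookup-then-execute pattern can be illustrated with a minimal sketch. It assumes the tokens are evaluated with Python's built-in `eval`; the actual table function in `auto_func.py` may do this differently. The `evaluate_token` helper, the stand-in `num_ided`, and the numbers in `calc_dict` are all hypothetical and exist only so the example runs without Tax-Calculator.

```python
# Stand-in for calc_dict[reform][behavior]. Real entries would be
# Tax-Calculator calculators; here they are made-up weighted counts.
calc_dict = {
    'Current Law': {'Dynamic': 45300000.0},
    'Repeal ID & Double SD': {'Dynamic': 7100000.0},
}


def num_ided(calc):
    """Hypothetical stand-in: the 'calculator' is already a count."""
    return calc


table_vars_dict = {
    'Reform': "key1",
    'Behavior': "key2",
    'Itemizers (millions)':
        "'{:,.1f}'.format(num_ided(calc_dict[key1][key2]) / 10.0**6)",
}


def evaluate_token(token, key1, key2):
    """Evaluate one token string, exposing only the names it needs."""
    namespace = {'calc_dict': calc_dict, 'num_ided': num_ided,
                 'key1': key1, 'key2': key2}
    return eval(token, namespace)


column_list = ['Reform', 'Itemizers (millions)']
rows = []
for key1 in ['Current Law', 'Repeal ID & Double SD']:   # reform labels
    for key2 in ['Dynamic']:                            # behavior labels
        row = []
        for column in column_list:
            row.append(evaluate_token(table_vars_dict[column], key1, key2))
        rows.append(row)

for row in rows:
    print(row)
```

Because each token is executed as code, only trusted strings should be placed in a dictionary used this way.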

Using the inputs already defined, auto_notebook now executes the `auto_calc.py` file. This generates a dictionary of calculators for each combination of behavior and policy. It also generates a dictionary of ten year cost estimates using both static and dynamic methods. This requires quite a bit of computing power and does take some time to render. The remaining functions refer to these dictionaries when assembling the report. 

In [13]:
exec(open('auto_calc.py').read())

----------------------------------------------------------------------
Begin execution of auto_calc.py
----------------------------------------------------------------------
 
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
 
----------------------------------------------------------------------
Begin calculator generation
----------------------------------------------------------------------
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
Done
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
Done
You loaded data for 2009.
Tax-Calculator startup automatically extrapolated your data to 2013.
You loaded data for 2009.
Tax-Calculator startup automatically extrapola
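
The shape of the calculator dictionary can be pictured with a small sketch. This is not the code in `auto_calc.py`: `make_calculator` is a hypothetical placeholder for constructing a Tax-Calculator calculator, and the empty dictionaries stand in for the real reform and behavior definitions. It only illustrates how a dictionary keyed first by reform label and then by behavior label ends up with one entry per combination.

```python
from collections import OrderedDict

# Labels as used in this notebook; the empty dicts are placeholders
# for the actual reform and behavior definitions.
ref_dict = OrderedDict([('Current Law', {}),
                        ('Repeal ID & Double SD', {}),
                        ('Repeal ID, Double SD, & Repeal PE', {})])
beh_dict = OrderedDict([('Dynamic', {})])


def make_calculator(reform, behavior):
    """Hypothetical placeholder for building a calculator object."""
    return {'reform': reform, 'behavior': behavior}


calc_dict = OrderedDict()
for ref_label, reform in ref_dict.items():
    calc_dict[ref_label] = OrderedDict()
    for beh_label, behavior in beh_dict.items():
        calc_dict[ref_label][beh_label] = make_calculator(reform, behavior)

# One calculator per (reform, behavior) pair.
n_calcs = sum(len(by_behavior) for by_behavior in calc_dict.values())
print(n_calcs)
```

With three reforms and one behavior this gives three entries, and a lookup such as `calc_dict['Repeal ID & Double SD']['Dynamic']` follows the same reform-then-behavior order used by the tokens in `table_vars_dict`.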

The following variables are used in the generation of the report and are relatively self-explanatory.

In [14]:
"""
Document Settings
"""
project_name = 'personal_exemption'
paper_title = 'Repealing the Personal Exemption'
paper_subtitle = 'Powered by Open Source Policy Modeling'
paper_author = ' Alex Brill'
paper_institute = 'American Enterprise Institute'
paper_date = NoEscape(r'\today')
paper_series = 'Tax Brief Series'
paper_bio = 'is a research fellow at the American Enterprise Institute (AEI).'

In [15]:
exec(open('auto_func.py').read())

The following variables are used as the body of the report and are relatively self-explanatory.

In [16]:
# This string is included under the 'policy' heading
policy_string = "Under current law, taxpayers can claim a personal exemption \
for themselves, their spouse, and each qualified dependent. The personal exemption amount will be $4,050 in 2017. \
The actual benefit depends on the taxpayer's marginal tax rate and gross income. \
Taxpayers can also claim a standard deduction as an alternative to itemizing their deductions. \
In 2017, the standard deduction will be $6,350 for single filers, $9,350 for head of \
household filers, and $12,700 for married couples filing jointly."

# This string is included under the 'reform options' heading
options_string = "Using the open-source Tax-Calculator, I present the results of two modifications to current law: (1) \
repeal all itemized deductions (IDs) except the mortgage interest deduction and the charitable deduction and double the standard deduction (SD), or \
(2) repeal these deductions, double the standard deduction, and repeal the personal exemption (PE). \
These reforms are implemented in each iteration of the model."

# the following string is included under the 'modeling notes' heading and 'tax-calculator' subheading
tax_calculator_string = "Tax-Calculator is an open source \
microsimulation tax model that computes federal individual income taxes and Federal Insurance \
Contribution Act (FICA) taxes for a sample of tax filing units for years beginning with 2013. \
The model can be used to simulate changes to federal tax policy to conduct revenue scoring, \
distributional impacts, and reform analysis. As an open source model, Tax-Calculator is under \
constant development and improvement. Therefore, the results reported in this paper will change \
as improvements are made. The model relies on data from the 2009 IRS Public Use File (PUF). \
These results are generated using Tax-Calculator Version 0.8.3."

# the following string is included under the 'modeling notes' heading and 'modeling assumptions' subheading
modeling_assumptions_string = "The simulation is a partial equilibrium analysis that uses an \
elasticity of taxable income of 0.4. The following itemized deductions are repealed as part of the reforms: \
medical expenses, state and local taxes, real estate, interest, and charity (as labelled in Tax-Calculator)."

Bullet points can be added to the report by defining them and including them in the `bullets_nstd` list. The bullets draw on functions defined in `auto_func.py` to perform calculations. 

In [21]:
# the following bullets are included under the 'comments' section
bullet_0 = ("The budgetary difference between those two options is " +
            str('{:.1f} million'.format(abs(ten_year_cost_PE(calc_dict['Repeal ID, Double SD, & Repeal PE']['Dynamic'],
                                                             calc_dict['Repeal ID & Double SD']['Dynamic'],
                                                             beh_dict['Dynamic'])))) +
            " over 10 years.")

bullet_1 = ("Under current law, "
            + str('{:.1f} million'.format(num_std(calc_cl) / 10.0**6))
            + " taxpayers are expected to claim the standard deduction, and "
            + str('{:.1f} million'.format(num_ided(calc_cl) / 10.0**6))
            + " are expected to itemize their deductions in " + str(start_year) + ".")

bullet_2 = ("When all itemized deductions are repealed, the standard deduction is doubled, "
            + "and the personal exemption is kept in place, the number of itemizers decreases from "
            + str('{:.1f} million'.format(num_ided(calc_dict['Current Law']['Dynamic']) / 10.0**6))
            + " to "
            + str('{:.1f} million'.format(num_ided(calc_dict['Repeal ID & Double SD']['Dynamic']) / 10.0**6))
            + " and "
            + str('{:,.1f} million'.format(lwrMTR_wages(calc_cl, calc_dict['Repeal ID & Double SD']['Dynamic']) / 10.0**6))
            + " taxpayers face a lower marginal rate.")
            
bullet_3 = ("If in addition the personal exemption is eliminated, " 
            + str('{:,.1f}'.format((num_taxcut(calc_cl, calc_dict['Repeal ID, Double SD, & Repeal PE']['Dynamic']) - num_taxcut(calc_cl, calc_dict['Repeal ID & Double SD']['Dynamic'])) / 10.0**6))
            + " million fewer taxpayers receive a tax cut and "
            + str('{:,.1f}'.format((lwrMTR_wages(calc_cl, calc_dict['Repeal ID, Double SD, & Repeal PE']['Dynamic']) - lwrMTR_wages(calc_cl, calc_dict['Repeal ID & Double SD']['Dynamic'])) / 10.0**6))
            + " million fewer will face a lower rate.")            

bullets_std = []
bullets_nstd = [bullet_0, bullet_1, bullet_2, bullet_3]
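
The number formatting used in the bullets can be tried on its own. The values below are hypothetical and are not Tax-Calculator output; the sketch only shows what each format specification produces when a raw count is scaled to millions.

```python
taxpayers = 123456789.0    # hypothetical weighted count of taxpayers
big_count = 1234567890.0   # hypothetical, large enough to need a separator
rate = 0.25                # hypothetical marginal tax rate as a fraction

# One decimal place, with the unit written into the format string.
one_decimal = '{:.1f} million'.format(taxpayers / 10.0**6)

# The comma in the format spec adds a thousands separator.
with_comma = '{:,.1f}'.format(big_count / 10.0**6)

# Fractions are scaled by 100 and shown with two decimals.
as_percent = '{:.2f}'.format(100 * rate)

bullet = "About " + one_decimal + " taxpayers are affected."
print(bullet)
print(with_comma)
print(as_percent)
```

Dividing by `10.0**6` rather than `10**6` keeps the arithmetic in floating point, which matters on Python 2 if a function ever returns an integer count.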

The following commands execute `auto_func.py` functions, which assemble the LaTeX document and render a PDF. Any individual command can be omitted if the corresponding part of the report is not wanted.

In [22]:
# produce document
#!    graphs are currently uncommented

doc = Document(project_name, geometry_options = {'top': '1.0in', 'bottom': '1.0in'})
add_packages(doc)
header(doc)
policy(doc)
options(doc)
v2_table(doc)
auto_comments(doc)
notes(doc)
author(doc)
logo_footer(doc)
doc.generate_pdf(project_name + '_Example', clean_tex=False)