# `nu18` vs `n24`

`n24` is the "Number of children eligible for Child Tax Credit" according to [Tax-Calculator documentation](http://open-source-economics.github.io/Tax-Calculator/). `nu18` is the number of people under age 18.

Based on the Child Tax Credit's [definition of child](https://www.thebalance.com/child-tax-credit-3193009), eligible children should be a subset of people under age 18: one criterion is being under age 17, and the other criteria narrow eligibility further. We would therefore expect `n24 <= nu18` for every tax unit.

This notebook examines tax units that violate this assumption by having `n24 > nu18`. This is discussed in [taxdata issue #157](https://github.com/open-source-economics/taxdata/issues/157).

*Data: CPS  |  Tax year: 2014  |  Author: Max Ghenis  |  Date run: 2018-02-28*

## Setup

### Imports

In [1]:
import taxcalc as tc
import pandas as pd
import numpy as np
import copy
from bokeh.io import show, output_notebook
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
import urllib as url_lib  # On Python 3.6 use "import urllib.request as url_lib".

In [2]:
tc.__version__

'0.16.2'

In [3]:
sns.set_style('white')
DPI = 75
mpl.rc('savefig', dpi=DPI)
mpl.rcParams['figure.dpi'] = DPI
mpl.rcParams['figure.figsize'] = 6.4, 4.8  # Default.

In [4]:
mpl.rcParams['font.sans-serif'] = 'Roboto'
mpl.rcParams['font.family'] = 'sans-serif'

# Set title text color to dark gray (https://material.io/color) not black.
TITLE_COLOR = '#212121'
mpl.rcParams['text.color'] = TITLE_COLOR

# Axis titles and tick marks are medium gray.
AXIS_COLOR = '#757575'
mpl.rcParams['axes.labelcolor'] = AXIS_COLOR
mpl.rcParams['xtick.color'] = AXIS_COLOR
mpl.rcParams['ytick.color'] = AXIS_COLOR

# Use Seaborn's default color palette.
# https://stackoverflow.com/q/48958426/1840471 for reproducibility.
sns.set_palette(sns.color_palette())

In [5]:
# Show two decimals in tables.
# The full option name avoids ambiguity with other "precision" options in newer pandas.
pd.set_option('display.precision', 2)

## Summaries

In [6]:
recs = tc.Records.cps_constructor()
calc = tc.Calculator(records=recs, policy=tc.Policy())
calc.calc_all()

You loaded data for 2014.
Tax-Calculator startup automatically extrapolated your data to 2014.


We ultimately care only about records with `n24 > 0`, but first load all records to compare totals.

In [7]:
full = calc.dataframe(['s006', 'nu18', 'n24'])

In [8]:
full['nu18_s006'] = full['nu18'] * full['s006']
full['n24_s006'] = full['n24'] * full['s006']
full['n24_gt_nu18'] = full['n24'] > full['nu18']
full['n24_vs_nu18'] = np.where(full['n24'] > full['nu18'], 'n24 greater',
                               np.where(full['nu18'] > full['n24'], 
                                        'nu18 greater', 'equal'))
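
The three-way label above uses nested `np.where` calls. As a rough sketch of an alternative, `np.select` expresses the same logic as a flat list of conditions. The toy frame below is hypothetical and is not CPS data; only the column names `s006`, `nu18`, and `n24` come from this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tax units; values are illustrative only, not CPS data.
toy = pd.DataFrame({
    's006': [100.0, 200.0, 50.0, 150.0],  # Sampling weights.
    'nu18': [2, 0, 1, 3],                 # People under age 18.
    'n24':  [2, 1, 3, 1],                 # Children eligible for the CTC.
})

# np.select checks the conditions in order and uses `default` when none match.
toy['n24_vs_nu18'] = np.select(
    [toy['n24'] > toy['nu18'], toy['nu18'] > toy['n24']],
    ['n24 greater', 'nu18 greater'],
    default='equal')

print(toy['n24_vs_nu18'].tolist())
```

Either form should produce the same labels; `np.select` may be easier to extend if more categories are added later.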

Weighted totals of `n24` and `nu18`, in millions.

In [9]:
full['n24_s006'].sum() / 1e6

81.645326520011665

In [10]:
full['nu18_s006'].sum() / 1e6

78.788336770005856

In [11]:
full.pivot_table(index='n24_vs_nu18', 
                 values=['s006', 'n24_s006', 'nu18_s006'],
                 aggfunc=sum)

| n24_vs_nu18 | n24_s006 | nu18_s006 | s006 |
|---|---|---|---|
| equal | 59800000.0 | 59800000.0 | 144000000.0 |
| n24 greater | 15700000.0 | 1440000.0 | 9050000.0 |
| nu18 greater | 6110000.0 | 17500000.0 | 9770000.0 |
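
For readers more familiar with `groupby`, a pivot table with a sum aggregation and no `columns` argument should give the same numbers as a grouped sum. The sketch below illustrates this on a hypothetical toy frame (the `group` column and all values are made up for illustration).

```python
import pandas as pd

# Hypothetical toy data; `group` stands in for a label such as n24_vs_nu18.
toy = pd.DataFrame({
    'group': ['equal', 'n24 greater', 'equal', 'nu18 greater'],
    's006': [100.0, 200.0, 50.0, 150.0],
    'n24_s006': [200.0, 200.0, 0.0, 150.0],
})

cols = ['n24_s006', 's006']

# Sum each value column within each group, two ways.
by_pivot = toy.pivot_table(index='group', values=cols, aggfunc='sum')
by_group = toy.groupby('group')[cols].sum()

# Compare the values after putting the columns in the same order.
same = (by_pivot[cols].values == by_group[cols].values).all()
print(same)
```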


Limit to records with `n24 > 0` for the remainder of the analysis.

In [12]:
df = full[full['n24'] > 0]

In [13]:
df.pivot_table(index='n24_gt_nu18', 
               values=['s006', 'n24_s006', 'nu18_s006'],
               aggfunc=sum)

| n24_gt_nu18 | n24_s006 | nu18_s006 | s006 |
|---|---|---|---|
| False | 66000000.0 | 69700000.0 | 37400000.0 |
| True | 15700000.0 | 1440000.0 | 9050000.0 |


Drill into records where `n24 > nu18`, broken down by `nu18`.

In [14]:
df[df['n24_gt_nu18']].pivot_table(
    index='nu18', 
    values=['s006', 'n24_s006', 'nu18_s006'],
    aggfunc=sum)

| nu18 | n24_s006 | nu18_s006 | s006 |
|---|---|---|---|
| 0.0 | 12200000.0 | 0.0 | 7710000.0 |
| 1.0 | 3240000.0 | 1280000.0 | 1280000.0 |
| 2.0 | 99500.0 | 55400.0 | 27700.0 |
| 3.0 | 53700.0 | 39300.0 | 13100.0 |
| 4.0 | 88000.0 | 70400.0 | 17600.0 |


In [15]:
df[df['n24_gt_nu18']].pivot_table(
    index='nu18', 
    columns='n24',
    values=['s006', 'n24_s006', 'nu18_s006'],
    aggfunc=sum)

`n24_s006`, by `nu18` (rows) and `n24` (columns):

| nu18 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 |
|---|---|---|---|---|---|
| 0.0 | 4570000.0 | 4310000.0 | 2100000.0 | 860089.17 | 374394.73 |
| 1.0 |  | 1640000.0 | 861000.0 | 478895.03 | 253801.17 |
| 2.0 |  |  | 50400.0 | 21357.81 | 27727.52 |
| 3.0 |  |  |  | 47190.87 | 6508.3 |
| 4.0 |  |  |  |  | 88014.73 |

`nu18_s006`, by `nu18` (rows) and `n24` (columns):

| nu18 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 |
|---|---|---|---|---|---|
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.0 |  | 821119.77 | 287157.91 | 119723.76 | 50760.23 |
| 2.0 |  |  | 33626.02 | 10678.91 | 11091.01 |
| 3.0 |  |  |  | 35393.15 | 3904.98 |
| 4.0 |  |  |  |  | 70411.79 |

`s006`, by `nu18` (rows) and `n24` (columns):

| nu18 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 |
|---|---|---|---|---|---|
| 0.0 | 4570000.0 | 2160000.0 | 699162.46 | 215022.29 | 74878.95 |
| 1.0 |  | 821000.0 | 287157.91 | 119723.76 | 50760.23 |
| 2.0 |  |  | 16813.01 | 5339.45 | 5545.5 |
| 3.0 |  |  |  | 11797.72 | 1301.66 |
| 4.0 |  |  |  |  | 17602.95 |


Share of tax units with `n24>0` where `n24>nu18`.

In [16]:
df.loc[df['n24_gt_nu18'], 's006'].sum() / df['s006'].sum()

0.19482597729249573
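
The share above is weighted by `s006`, so it can differ from a simple count of records. As a minimal sketch, the hypothetical helper `weighted_share` below (not part of Tax-Calculator) applies the same formula to made-up toy data and compares it with the unweighted share.

```python
import pandas as pd

def weighted_share(frame, mask, weight='s006'):
    """Share of total weight carried by rows where `mask` is True."""
    return frame.loc[mask, weight].sum() / frame[weight].sum()

# Hypothetical toy tax units, all with n24 > 0; values are illustrative only.
toy = pd.DataFrame({
    's006': [100.0, 200.0, 100.0, 400.0],  # Sampling weights.
    'nu18': [2, 0, 1, 3],
    'n24':  [2, 1, 3, 1],
})

violates = toy['n24'] > toy['nu18']

share_weighted = weighted_share(toy, violates)  # (200 + 100) / 800
share_unweighted = violates.mean()              # 2 of 4 records

print(share_weighted, share_unweighted)
```

In this toy case the weighted share (0.375) is lower than the unweighted share (0.5) because the violating records carry relatively small weights.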