## Simulation 
We begin by importing packages, reading in the dataset, and defining helper functions that we use repeatedly to calculate poverty rates and Gini coefficients and to compute percentage changes.

The unit of observation in the dataframe is the SPM unit-year. All values are averaged across 2018-2020.

In [1]:
import microdf as mdf
import pandas as pd
import numpy as np
import us

person_sim = pd.read_csv(
    "https://github.com/UBICenter/child-allowance/blob/master/jb/data/person_sim.csv.gz?raw=true",
    compression="gzip")

# Define a function to calculate poverty rates from the poverty flag
def pov(data, group):
    return pd.DataFrame(
        mdf.weighted_mean(data, "poverty_flag", "asecwt", groupby=group)
    )

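# Define a function to calculate deep poverty rates from the deep poverty flag
# (named deep_pov so that it does not overwrite pov above)
def deep_pov(data, group):
    return pd.DataFrame(
        mdf.weighted_mean(data, "deep_poverty_flag", "asecwt", groupby=group)
    )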

# Define function to generate gini coefficients
def gin(data, group):
    return pd.DataFrame(
        data.groupby(group).apply(
            lambda x: mdf.gini(x, "spmftotres", "asecwt")
        )
    )

# Define percentage change function
def percent_change(new, old):
    return 100 * (new - old) / old
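
To make the helpers above concrete, the following is a minimal sketch of what a weighted poverty rate and its percentage change look like on toy data. The data values are invented for illustration, `weighted_rate` is a hypothetical stand-in written with plain pandas (it is not `mdf.weighted_mean`), and only the column names `sim_flag`, `poverty_flag` and `asecwt` come from the notebook.

```python
import pandas as pd

# Toy person-level data; all values are invented for illustration.
toy = pd.DataFrame({
    "sim_flag": ["baseline"] * 3 + ["child_allowance"] * 3,
    "poverty_flag": [1, 0, 1, 0, 0, 1],
    "asecwt": [100.0, 300.0, 100.0, 100.0, 300.0, 100.0],
})


def weighted_rate(data, flag, weight, group):
    """Hypothetical helper: weighted mean of a 0/1 flag within each group."""
    weighted_flag = data[flag] * data[weight]
    numerator = weighted_flag.groupby(data[group]).sum()
    denominator = data[weight].groupby(data[group]).sum()
    return numerator / denominator


def percent_change(new, old):
    """Same formula as the notebook's percent_change."""
    return 100 * (new - old) / old


# baseline: 200 / 500 of the weight is poor; child_allowance: 100 / 500.
rates = weighted_rate(toy, "poverty_flag", "asecwt", "sim_flag")
change = percent_change(rates["child_allowance"], rates["baseline"])
print(rates)
print(change)
```

Note that `percent_change` takes the two levels (new and old) as arguments, not a difference and a level.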

We generate poverty rates for the total population and by demographics of interest: sex, race and Hispanic origin, young-child status (under 6), and state. We similarly generate Gini coefficients for the total population and by state.

In [6]:
# Poverty rates by demographics of interest
poverty_rate = pov(person_sim, "sim_flag") # Overall poverty rate
poverty_rate_sex = pov(person_sim, ["sim_flag", "sex"]) # Poverty rates by sex
poverty_rate_race_hispan = pov(person_sim, ["sim_flag", "race_hispan"])  # Poverty rates by race and Hispanic origin
poverty_rate_child = pov(person_sim[person_sim.child_6], "sim_flag") # Poverty rate among children under 6

# State-based poverty rates
poverty_rate_state = pov(person_sim, ["sim_flag", "state"])

# Rename the value column of each poverty-rate table
poverty_rates = [
    poverty_rate,
    poverty_rate_sex,
    poverty_rate_race_hispan,
    poverty_rate_state,
    poverty_rate_child,
]
for i in poverty_rates:
    i.rename({0: "poverty_rate"}, axis=1, inplace=True)


# Gini coefficients, overall and by state
gini = gin(person_sim, "sim_flag")
gini_state = gin(person_sim, ["sim_flag", "state"])

# Rename the value column of each Gini table
ginis = [
    gini,
    gini_state,
]
for i in ginis:
    i.rename({0: "gini_coefficient"}, axis=1, inplace=True)

# Create pivot table to interpret state-based poverty effects
state_pov = poverty_rate_state.pivot_table(
    values="poverty_rate", index="state", columns="sim_flag"
)
# Create pivot table to interpret state-based gini effects
state_gini = gini_state.pivot_table(
    values="gini_coefficient", index="state", columns="sim_flag"
)

KeyError: 'spmftotres'
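
For readers who want to see what a weighted Gini coefficient measures, the following is a minimal sketch based on the area under the Lorenz curve. It is an illustrative stand-in, not microdf's implementation of `mdf.gini`, and results may differ from it in edge cases; `weighted_gini` is a hypothetical name, and the sketch assumes non-negative resources with a positive weighted total.

```python
import numpy as np


def weighted_gini(values, weights):
    """Hypothetical helper: Gini coefficient from a weighted Lorenz curve.

    Assumes non-negative values and a positive weighted total.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Sort units from poorest to richest.
    order = np.argsort(values)
    values, weights = values[order], weights[order]

    # Cumulative population share and cumulative resource share,
    # each starting at the origin (0, 0).
    pop_share = np.concatenate([[0.0], np.cumsum(weights) / weights.sum()])
    resources = values * weights
    res_share = np.concatenate([[0.0], np.cumsum(resources) / resources.sum()])

    # Trapezoid rule for the area under the Lorenz curve.
    area = np.sum(np.diff(pop_share) * (res_share[1:] + res_share[:-1]) / 2)

    # Gini = 1 - 2 * (area under the Lorenz curve).
    return 1 - 2 * area


equal = weighted_gini([10, 10, 10], [1, 1, 1])   # everyone has the same resources
unequal = weighted_gini([0, 10], [1, 1])         # one of two units holds everything
print(equal, unequal)
```

Applying such a function within each `sim_flag` (and `state`) group mirrors the structure of `gin`, given a resource column that exists in the dataframe.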

We then generate state-level changes in poverty rates and Gini coefficients, in both absolute and percentage terms, to reflect the impact of each simulated policy.

In [None]:

# Generate state-based poverty rate changes (in levels) and percentage changes
state_pov["poverty_change_cc"] = state_pov.cc_replacement - state_pov.baseline
state_pov["poverty_change_flat"] = state_pov.child_allowance - state_pov.baseline
# percent_change expects the new and old levels, not a difference
state_pov["poverty_change_pc_cc"] = percent_change(
    state_pov.cc_replacement, state_pov.baseline
)
state_pov["poverty_change_pc_flat"] = percent_change(
    state_pov.child_allowance, state_pov.baseline
)

# Construct state-based Gini coefficient changes (in levels) and percentage changes
state_gini["gini_change_cc"] = state_gini.cc_replacement - state_gini.baseline
state_gini["gini_change_flat"] = state_gini.child_allowance - state_gini.baseline
# percent_change expects the new and old levels, not a difference
state_gini["gini_change_pc_cc"] = percent_change(
    state_gini.cc_replacement, state_gini.baseline
)
state_gini["gini_change_pc_flat"] = percent_change(
    state_gini.child_allowance, state_gini.baseline
)

# Re-arrange and present pivot tables, sorted ascending by % change under
# the flat child allowance, so the largest reductions appear first
state_pov.sort_values(by="poverty_change_pc_flat", ascending=True)
state_gini.sort_values(by="gini_change_pc_flat", ascending=True)
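
The pivot-and-compare steps above can be sketched end to end on toy data. The state rates below are invented for illustration; only the column names (`state`, `sim_flag`, `poverty_rate`) and the scenario labels (`baseline`, `cc_replacement`, `child_allowance`) are taken from the notebook, and the shortened output column names are hypothetical.

```python
import pandas as pd

# Long-format toy poverty rates by state and scenario; values are invented.
long = pd.DataFrame({
    "state": ["AK", "AK", "AK", "AL", "AL", "AL"],
    "sim_flag": ["baseline", "cc_replacement", "child_allowance"] * 2,
    "poverty_rate": [0.10, 0.09, 0.08, 0.20, 0.19, 0.15],
})

# One row per state, one column per scenario.
wide = long.pivot_table(values="poverty_rate", index="state", columns="sim_flag")

# Change in levels (percentage points as a fraction) and percentage change
# relative to the baseline level.
wide["change_flat"] = wide["child_allowance"] - wide["baseline"]
wide["change_pc_flat"] = 100 * wide["change_flat"] / wide["baseline"]

# Ascending sort puts the largest percentage reduction first.
ranked = wide.sort_values(by="change_pc_flat", ascending=True)
print(ranked)
```

In this toy example AL ranks first: its rate falls by a quarter of its baseline, against a fifth for AK.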

## Analysis

- To do: visualisations and analysis