---
title: "Assignment 1 - Solutions"
bibliography: ../reading_list.bib
---


1. Following [Recitation 1](../recitations/recitation-1.qmd), load the [CPS-ASEC data](https://github.com/josephmullins/ECON4538/tree/main/data) and perform the same steps to deflate nominal \$ values to 2017 dollars.

This is just a matter of copying the code from the recitation:


In [None]:
using DataFrames, CSV, StatsPlots, DataFramesMeta, StatsBase
pce = CSV.read("../data/PCE_index.csv",DataFrame)

data = @chain CSV.read("../data/asec_1970_2019.csv",DataFrame) begin
  innerjoin(pce,on=:YEAR)
  @transform begin 
    # fill in some missing values and deflate
    :EITC = 100 * coalesce.(:EITCRED,0.) ./ :PCE
    :CTC = 100 * coalesce.(:ACTCCRD,0.) ./ :PCE
    :SSI = 100 * coalesce.(:INCSSI,0.) ./ :PCE
    :WELF = 100 * :INCWELFR ./ :PCE
    :SS = 100 * :INCSS ./ :PCE
    # create the income measures
    :EARN = 100 * (:INCWAGE .+ :INCBUS) ./ :PCE #<- deflate earnings (2017 base)
    :LABINC = 100 * :INCWAGE ./ :PCE #<- just labor income
    end
    @transform :GOV = :EITC .+ :CTC .+ :SSI .+ :WELF .+ :SS
end
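
As a minimal sketch of what the deflation step does, assuming the PCE index is normalized to 100 in the base year 2017 (the index values below are hypothetical, not taken from `PCE_index.csv`):

```julia
# Hypothetical PCE index values, normalized so that 2017 = 100
pce_index = Dict(2000 => 80.0, 2017 => 100.0)

# Convert a nominal dollar amount from `year` into 2017 dollars
deflate(nominal, year) = 100 * nominal / pce_index[year]

real_2000 = deflate(40_000.0, 2000)  # prices were lower in 2000, so the real value is larger
real_2017 = deflate(40_000.0, 2017)  # base-year values are unchanged
```

This is the same `100 * x ./ :PCE` operation applied column by column in the code above.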

2. In @Heathcote2023, the *Household Pooling index* $HP_{t}$ is defined as
$$ HP_{t} = \frac{var(y_{it}) - var(\bar{y}_{it})}{var(y_{it})}$$
where $\bar{y}_{it}$ is per-person income for household $i$ at time $t$. Calculate this index when $y$ is defined solely as labor market earnings (i.e. wage earnings) and when $y$ is defined as labor market + business income. Plot both lines to compare them and comment on any differences.

The code block below computes per-person household income (the within-household mean) and then calculates the pooling index for each measure of income:


In [None]:
pooling = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @select :CPSID :CPSIDP :YEAR :EARN :INCWAGE
    stack(Not([:CPSID,:CPSIDP,:YEAR]))
    groupby([:YEAR,:CPSID,:variable])
    @transform :value_equiv = mean(:value)
    @groupby([:YEAR,:variable])
    @combine :pooling = (var(:value) - var(:value_equiv)) / var(:value)
    #unstack([:YEAR,:CPSID],:variable,:value)
end

@df pooling plot(:YEAR,:pooling,group=:variable)

This calculation implicitly weights each pooled value $\bar{y}_{it}$ by the number of individuals in the household, because the household mean is assigned to every member before the variance is taken. An alternative is the unweighted version, which counts each household once:


In [None]:
var_pooled = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @select :CPSID :CPSIDP :YEAR :EARN :INCWAGE
    stack(Not([:CPSID,:CPSIDP,:YEAR]))
    groupby([:YEAR,:CPSID,:variable])
    @combine :value_equiv = mean(:value)
    @groupby([:YEAR,:variable])
    @combine :var_pooled = var(:value_equiv)
end

pooling_alt = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @select :CPSID :CPSIDP :YEAR :EARN :INCWAGE
    stack(Not([:CPSID,:CPSIDP,:YEAR]))
    groupby([:YEAR,:variable])
    @combine :var = var(:value)
    innerjoin(_,var_pooled,on=[:YEAR,:variable])
    @transform :pooling = (:var .- :var_pooled) ./ :var
end

@df pooling_alt plot(:YEAR,:pooling,group=:variable)


Either method earns full points (although I think the first method is more sensible).
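
To see how the two weighting schemes can differ, here is a small sketch on made-up data (one three-person household and one single-person household; the numbers are purely illustrative and the variable names are not from the assignment data):

```julia
using Statistics

# Toy data: household id and individual income for four people
hh = [1, 1, 1, 2]
y  = [10.0, 20.0, 30.0, 40.0]

# Per-person (pooled) income for each household
hh_mean = Dict(h => mean(y[hh .== h]) for h in unique(hh))

# Person-weighted: every member is assigned the household mean
ybar_person = [hh_mean[h] for h in hh]
# Unweighted: each household contributes a single observation
ybar_house = [hh_mean[h] for h in unique(hh)]

hp_weighted   = (var(y) - var(ybar_person)) / var(y)
hp_unweighted = (var(y) - var(ybar_house)) / var(y)
```

In this toy case the two versions give different answers, and the unweighted one is even negative, because the small household receives as much weight as the large one when the variance of pooled income is computed.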


3. Calculate and plot an equivalent index for government transfers where now $y_{it}$ is per-person household earnings (business + wage income) and $\bar{y}_{it}$ is per-person household income + government transfers. You can follow the variable for transfers we constructed in Recitation 1. Name at least one way that this measure differs from the measure used to construct the Government redistribution index in Figure 19 of @Heathcote2023.


In [None]:
gov_pooling = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @transform :INCPOST = :EARN .+ :GOV
    @select :YEAR :CPSID :CPSIDP :INCPOST :EARN
    stack(Not([:CPSID,:CPSIDP,:YEAR]))
    groupby([:YEAR,:CPSID,:variable])
    @transform :value = mean(:value) #<- could use @combine here for different weighting
    @groupby([:YEAR,:variable])
    @combine :var = var(:value)
    #@combine :pooling = (var(:value) - var(:value_equiv)) / var(:value)
    unstack(:YEAR,:variable,:var)
    @transform :pooling = (:EARN .- :INCPOST) ./ :EARN
end

@df gov_pooling plot(:YEAR,:pooling)


Consulting the appendix in the paper, we see that the paper uses several sources of government transfers that we do not use here. Further, we do not calculate taxes using TAXSIM as the paper does.
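
The index itself is just the share of earnings variance removed by transfers. A minimal sketch with hypothetical per-person values (not drawn from the CPS data):

```julia
using Statistics

# Hypothetical per-person household earnings and transfers
earn = [0.0, 10.0, 20.0, 30.0]
gov  = [6.0, 4.0, 2.0, 0.0]   # transfers fall as earnings rise
post = earn .+ gov            # income after transfers

# Fraction of the variance in earnings that transfers eliminate
redistribution = (var(earn) - var(post)) / var(earn)
```

A value near zero would mean transfers leave inequality in earnings essentially unchanged, while a value near one would mean they remove almost all of it.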

4. Re-compute this index for government transfers for households below the 20th percentile of per-person income in each year. Comment on any differences with the previous figure.


In [None]:
gov_pooling = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @transform :INCPOST = :EARN .+ :GOV
    @select :YEAR :CPSID :CPSIDP :INCPOST :EARN
    groupby([:YEAR,:CPSID])
    @combine :EARN = mean(:EARN) :INCPOST = mean(:INCPOST) #<- could use @transform here for different weighting
    @groupby(:YEAR)
    @transform :q20 = quantile(:EARN,0.2) #<- 20th percentile of per-person earnings in each year
    @subset :EARN .< :q20 #<- keep bottom 20% in each year
    @groupby(:YEAR)
    @combine :pooling = (var(:EARN) .- var(:INCPOST)) ./ var(:EARN)
end

@df gov_pooling plot(:YEAR,:pooling)
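
The selection step can be sketched on toy data as follows. The numbers are hypothetical, and the transfer schedule is an assumption chosen only to make the arithmetic easy to follow:

```julia
using Statistics

# Hypothetical per-person earnings for ten households in one year
earn = collect(1.0:10.0)
gov  = 0.5 .* (10 .- earn)    # assumed transfer schedule, declining in earnings
post = earn .+ gov

# 20th percentile of earnings and the households strictly below it
q20  = quantile(earn, 0.2)
keep = earn .< q20

# Index computed only on the bottom group
index_bottom = (var(earn[keep]) - var(post[keep])) / var(earn[keep])
```

Note that the strict inequality drops any household sitting exactly at the 20th percentile. In data where many households share the same earnings value at that cutoff (for example, zero earnings), the retained group could be much smaller than 20% of households, so it is worth checking group sizes before interpreting the plot.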


5. In Recitation 1, we plotted the share of total transfers coming from each of five programs. Re-calculate this share for only the poorest 20% of households (as measured by per-person earnings) and comment on any differences you see compared to the original figure (**extra credit**).


In [None]:
house = @chain data begin
    @subset .!ismissing.(:CPSID) #<- drop all obs where this variable is missing (all of 1970)
    @select :CPSID :CPSIDP :YEAR :EARN :GOV :SS :SSI :WELF :CTC :EITC
    stack(Not([:CPSID,:CPSIDP,:YEAR]))
    groupby([:YEAR,:CPSID,:variable])
    @combine :value = mean(:value)
    unstack([:YEAR,:CPSID],:variable,:value)
end

@chain house begin
    groupby(:YEAR)
    @transform :q20 = quantile(:EARN,0.2) #<- 20th percentile of per-person earnings in each year
    @subset :EARN .< :q20 #<- keep bottom 20% in each year
    groupby(:YEAR)
    @combine  begin 
        :eitc = mean(:EITC)  / mean(:GOV)
        :ssi = mean(:SSI)  / mean(:GOV)
        :welf = mean(:WELF)  / mean(:GOV)
        :ctc = mean(:CTC)  / mean(:GOV)
        :ss = mean(:SS)  / mean(:GOV)
    end
    stack(Not(:YEAR))
    @df _ plot(:YEAR,:value,group = :variable,linewidth = 3.)
end

Among the poorest 20% of households, transfers from the child tax credit make up a much smaller share of total transfers than in the original figure, reflecting the fact that these credits are effectively targeted at higher-earning households.
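
As a quick sanity check on the share calculation, here is a sketch with three hypothetical programs (the values are made up). Because total transfers are the sum of the programs, the shares of mean transfers must add up to one:

```julia
using Statistics

# Hypothetical per-person transfers for three households and three programs
programs = (eitc = [2.0, 1.0, 0.0],
            welf = [3.0, 0.0, 0.0],
            ss   = [0.0, 0.0, 4.0])

# Total transfers per household (elementwise sum across programs)
gov = sum(values(programs))

# Share of mean total transfers accounted for by each program
shares = map(v -> mean(v) / mean(gov), programs)

total_share = sum(values(shares))
```

If the shares in the real data do not sum to one in some year, that usually points to a missing-value problem in one of the component variables.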