ssa_national set is not gender balanced #9

Closed
bmschmidt opened this Issue Jun 25, 2014 · 3 comments

Contributor

bmschmidt commented Jun 25, 2014

method="ssa" could probably use some adjustment to compensate for something I noticed in the Social Security dataset: before about 1918, it's two-thirds women.

I assume this has to do with who was eligible for benefits when the program was created in the late 1930s: either the men born around 1900 are dead, or, more likely, they're not eligible for survivor benefits for spouses or something.

The ratios this method produces for years around 1900 distort the female percentage for each name. For example, in 1901 merle has 91 women and 52 men counted, but since 69% of the sample is female that year, the male count should be weighted about 2.2x higher (0.69/0.31), which yields the right prediction: male, not female.
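To make the arithmetic concrete, here is a minimal sketch in R using only the figures quoted above. The variable names are hypothetical, and the correction assumes the true population is split evenly between men and women, which is a simplification.

```r
# Counts for "merle" in 1901, as quoted above
female <- 91
male <- 52

# Share of the whole 1901 sample that is female, as quoted above
share_female <- 0.69

# Assumed correction: scale male counts so the year's sample is balanced
male_factor <- share_female / (1 - share_female)  # roughly 2.2
male_adjusted <- male * male_factor

# Proportion female before and after the adjustment
raw_prop_female <- female / (female + male)
adj_prop_female <- female / (female + male_adjusted)

# Predicted gender flips once the skew is accounted for
raw_prediction <- if (raw_prop_female > 0.5) "female" else "male"
adj_prediction <- if (adj_prop_female > 0.5) "female" else "male"
```

With the raw counts the name comes out female; with the rescaled male count it comes out male.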

An illustrative plot:

library(dplyr)
library(ggplot2)

gender::ssa_national %>%
  group_by(year) %>%
  summarize(ratio = sum(female) / sum(female + male)) %>%
  ggplot(aes(x = year, y = ratio)) +
  geom_line() +
  labs(title = "Percentage of the set that is women")

Illustrative chart

Contributor

lmullen commented Jun 25, 2014

Thanks for pointing this out, @bmschmidt. I'll have to take it into account.

lmullen added the bug label Jun 25, 2014

Contributor

lmullen commented Jul 16, 2014

First stab at a solution with justification for the reasoning: http://rpubs.com/lmullen/gender-imbalance-ssa

lmullen added a commit that referenced this issue Jul 22, 2014

Correct skewed gender in SSA data
This commit adds a function that calculates correction factors for skewed
gender ratios in the SSA data (as explained in #9).

This commit doesn't pass all the tests when the `gender` function is passed a
data frame of values and the `years` parameter is a logical. This was a bad
design for the function and it is unnecessary because `Map()` and other
functional programming methods in R can let the user do this in more sensible
ways. The next release will incorporate breaking changes to the way that the
`gender` function works, so these tests will be fixed there.

Fixes #9
a210ee1
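
For readers following along, one way a per-year correction factor could be computed is sketched below in base R. This is not the function added in the commit above: the helper name `correction_factors`, the `male_factor` column, and the toy data are all hypothetical, and the sketch assumes the true population in each year is evenly split.

```r
# Toy data with the same columns as gender::ssa_national (values are made up,
# except the 1901 "merle" counts quoted earlier in this issue)
toy <- data.frame(
  name   = c("merle", "mary", "john", "merle", "mary", "john"),
  year   = c(1901, 1901, 1901, 1950, 1950, 1950),
  female = c(91, 600, 5, 40, 500, 10),
  male   = c(52, 2, 258, 60, 5, 485)
)

# Hypothetical helper: one multiplier for male counts per year,
# chosen so that each year's totals become balanced
correction_factors <- function(df) {
  totals <- aggregate(cbind(female, male) ~ year, data = df, FUN = sum)
  totals$male_factor <- totals$female / totals$male
  totals
}

factors <- correction_factors(toy)

# Apply the per-year factor to every name and recompute the proportion female
adjusted <- merge(toy, factors[, c("year", "male_factor")], by = "year")
adjusted$prop_female <- adjusted$female /
  (adjusted$female + adjusted$male * adjusted$male_factor)
```

In this toy data the 1901 rows are rescaled while the already balanced 1950 rows get a factor of 1 and are left unchanged.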
Contributor

lmullen commented Jul 22, 2014

Fixed on develop.

lmullen closed this Jul 22, 2014
