ssa_national set is not gender balanced #9
Comments
|
Thanks for pointing this out, @bmschmidt. I'll have to take it into account. |
|
First stab at a solution with justification for the reasoning: http://rpubs.com/lmullen/gender-imbalance-ssa |
This commit adds a function that calculates correction factors for skewed gender ratios in the SSA data (as explained in #9). This commit doesn't pass all the tests when the `gender` function is passed a data frame of values and the `years` parameter is a logical. This was a bad design for the function and it is unnecessary because `Map()` and other functional programming methods in R can let the user do this in more sensible ways. The next release will incorporate breaking changes to the way that the `gender` function works, so these tests will be fixed there. Fixes #9
|
Fixed on |
`method="ssa"` could probably use some adjustment to compensate for something I noticed in the Social Security dataset: before about 1918, it's two-thirds women. I assume this has to do with who was eligible for benefits when the program was created in the late 30s: either the men born around 1900 are dead, or more likely they're not eligible for survivor benefits for spouses or something.
All the ratios for years around 1900 from this method are distorting the female % of the name. For example, in 1901 `merle` has 91 women and 52 men counted, but since 69% of the sample is female that year, that male number should be 2.2x higher, so the right prediction is that it's male, not female.

An illustrative plot:
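
The `merle` arithmetic above can be checked with a short sketch. This is not the package's implementation (the package is written in R, and its actual correction is described in the linked write-up); the function names `corrected_counts` and `predict` are hypothetical, and the sketch assumes the simplest possible correction: rescale the male count by the ratio of the year's female share to its male share, so the sample is treated as if it were 50/50.

```python
def corrected_counts(female, male, female_share):
    """Rescale the male count so the year's sample is treated as 50/50.

    female_share is the fraction of that year's whole sample that is female
    (0.69 for 1901 in the example above). Hypothetical helper, for illustration.
    """
    if not 0 < female_share < 1:
        raise ValueError("female_share must be strictly between 0 and 1")
    # 0.69 / 0.31 is roughly 2.2, the factor quoted in the issue.
    male_factor = female_share / (1 - female_share)
    return female, male * male_factor


def predict(female, male, female_share):
    """Return (predicted gender, corrected proportion female) for one name."""
    f, m = corrected_counts(female, male, female_share)
    proportion_female = f / (f + m)
    label = "female" if proportion_female > 0.5 else "male"
    return label, proportion_female


# merle in 1901: 91 women and 52 men counted.
uncorrected = predict(91, 52, 0.5)   # a 0.5 share means no correction is applied
corrected = predict(91, 52, 0.69)    # 69% of that year's sample is female

print("uncorrected:", uncorrected)
print("corrected:  ", corrected)
```

With the raw counts the name comes out female; after the male count is scaled by about 2.2 it comes out male, which matches the reasoning in the issue.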