# "Intersectional Income Gaps in the United States as Distances in Sectional-Cost Space"
> "I introduce the idea that the magnitude of the discrimination faced by an individual having some identity is given by the perceived distance of that identity from the social reference identity, and do some not-at-all robust analysis suggesting the idea isn't entirely unreasonable in the context of income gaps in the United States"
- toc: true
- branch: master
- badges: true
- comments: true
- categories: [intersectionality, inequality]
- hide: false
- search_exclude: true

## Introduction

In this note I introduce a method for predicting the income gap between people of a reference identity and people who differ from that reference identity in both race and sex, then apply that method to predict income gaps between white males and black, hispanic, and asian females in the United States from 1948 to 2018 using publicly available United States Census Bureau data.

## Model

The model is most easily explained by giving a specific example then generalizing it. To that end, I'll first use the model to predict the income gap between white males and black females, then generalize the model to include the income gaps between white males and hispanic and asian females. 

### Predicting the White-Male Black-Female Income Gap

For now, define an identity as a combination of race and sex, and define the reference identity as a white-male. The model then suggests that the income gap between a black-female and a white-male is predicted by the length of the hypotenuse of a right triangle with sides of lengths given by the income gap between a white-female and a white-male and the income gap between a black-male and a white-male.

Representing the income of a white-male as $WM$, the income of a white-female as $WF$, the income of a black-male as $BM$, and the income of a black-female as $BF$, the example can be visualized as

![](my_icons/intersectionality/triangle.PNG)

For example, in 2017 the $WM-WF$ income gap was $\$18720$ and the $WM-BM$ income gap was $\$15724$. The predicted 2017 $WM-BF$ income gap is then $\$24448$. The actual 2017 $WM-BF$ income gap was $\$22197$, about 9\% less than predicted.
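
Written out, the 2017 prediction is just the Pythagorean calculation on the two sectional gaps. The hat marking the predicted gap is notation added here for illustration, and all amounts are in dollars:

$$
\widehat{WM - BF} = \sqrt{(WM - WF)^2 + (WM - BM)^2} = \sqrt{18720^2 + 15724^2} \approx 24448
$$

The relative miss is then $(24448 - 22197)/24448 \approx 0.092$, which is the roughly 9\% overestimate quoted above.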

The United States Census Bureau has been collecting income data, including data partitioned by race and sex, since 1947. Making the above calculation for every year in that time range then plotting the resulting income gap predictions against the actual income gaps results in the following

![](my_icons/intersectionality/bf.png)

The correlation between the predicted and actual income gaps is $.998$, which is just silly. I'm probably doing something dumb. The slope and intercept of the linear fit are $.879$ and $296.39$, respectively. In a linear fit $y=mx+b+error$, if $m \approx 1$, $b \approx 0$, and the $error$ term is negligible, then $y \approx x$. So, though it systematically overestimates, the triangular combination of the cost of being female and the cost of being black appears to be an excellent estimate of the cost of being a black female, at least for the case of income in the United States.

### Predicting the White-Male Black/Hispanic/Asian-Female Income Gap

More generally, the income cost of having some identity is the distance of the identity from the origin in sectional-cost space. Sections of identity are orthogonal axes in a Euclidean space, and intersectional identities are those having multiple non-zero sectional coordinates. In two-dimensional Euclidean space, the distance between two points can be calculated using the Pythagorean theorem, which is why I used a triangle to introduce the model.
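
As a rough sketch of that idea, the distance from the origin can be computed for any number of sectional costs, with the triangle as the two-axis case. The function name `sectional_distance` is hypothetical. The only real inputs are the 2017 gaps quoted earlier, and the third call uses plain numbers, not data, to show a three-axis case.

```python
import math


def sectional_distance(sectional_costs):
    """Euclidean distance from the origin (the reference identity),
    given one cost per orthogonal section of identity."""
    return math.hypot(*sectional_costs)


# Two sections (sex, race): the 2017 WM-WF and WM-BM gaps from the example above.
gap_2017 = sectional_distance([18720, 15724])
print(round(gap_2017))

# One section only: the distance is just that sectional cost.
print(sectional_distance([18720]))

# Three sections: arbitrary illustrative numbers, not income data.
print(sectional_distance([3, 4, 12]))
```

With a single non-zero coordinate the distance collapses to that sectional gap, which is why the white-female and black-male gaps sit on the axes themselves.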

![title](my_icons/intersectionality/space.png)

The interpretation of the model is that the magnitude of the discrimination faced by an identity is the perceived "distance" of that identity from the social reference identity. The white-male white-female income gap is the perceived distance between maleness and femaleness; the white-male black/hispanic/asian-male income gap is the perceived distance between black/hispanic/asianness and whiteness.

Hispanic-males are slightly closer to the origin than black-males, meaning hispanics are perceived as slightly less "not white" than blacks. It is potentially noteworthy that the white-male black/hispanic-male income gap has been increasing since 2016, aligning with the increased political rhetoric labeling blacks and hispanics as "others".

![title](my_icons/intersectionality/bhaf.png)

## Higher-dimensional Identities

Of course, in reality, there are an uncountable number of dimensions of identity. Boiling identity down to x gender y race is a projection from this larger space onto two dimensions (principal component analysis is relevant here). Inevitably, this projection loses information. Identities which appear close to each other in two dimensions might actually be far apart in a more complete space, and identities which appear to be far apart might actually be close together. For example, disabled white males and able white males are identically positioned in race-sex sectional-cost space. If the space were extended to include ability, disabled white males might be closer to able white females than to able white males.


Hypothetically, the space can be extended to include as many independent dimensions of identity as can be identified. While there will likely be diminishing returns in terms of predictive capacity, accounting for greater dimensionality of identity might reduce the noise in the predictions and account for some of the model's systematic overestimation. Or it might blow it up. We'll see.