In [1]:
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
from bokeh.models.annotations import Label
from bokeh.models import ColumnDataSource, LinearColorMapper, BasicTicker, PrintfTickFormatter, ColorBar, NumeralTickFormatter
from bokeh.models.tools import HoverTool
from bokeh.palettes import brewer
output_notebook()

## The Relationship Between Income and Wealth

Does income predict one's overall wealth? If so, how strong is the relationship? The answers to these questions have significant implications for public policy on taxation and wealth distribution. Below we've visualized federal data on income and net worth from the [2016 Survey of Consumer Finances (SCF)](https://www.federalreserve.gov/econres/scfindex.htm).

Both income and wealth are important measures of household financial well-being. Income is the flow of financial resources into a household from wages and salaries, investment returns, government transfer payments, and other sources. Wealth is a household’s total saved resources and is usually measured as net worth (total assets less total debts).

The relationship between income and wealth may seem simple: as income increases, so too should wealth. There is, of course, some correlation between income and net worth, but the link is not as strong as you might think. In reality, the correlation between income and wealth is positive but relatively low.
 
There are other components of income and wealth that we ignore in this visualization and that a more comprehensive analysis would need to consider. For example, those with high wealth may receive income from work as well as from interest, dividends, businesses, and other sources, whereas those with low wealth are unlikely to have income from these non-work sources. The modest correlation makes sense when one considers that income is just one factor contributing to overall net worth. Spending habits, savings, and investments all come into play in building wealth over time.

In [10]:
xl = pd.ExcelFile("data/SCFP2016.xlsx")
xl.sheet_names

df = xl.parse("SCFP2016")
df = df[['INCOME', 'WGT', 'AGE', 'WAGEINC', 'NETWORTH']]

In [15]:
# df['WGT'] = df['WGT']/500
source = ColumnDataSource(df)

p = figure(y_axis_type="log",x_axis_type="log", plot_width=800, plot_height=800)

p.title.text = 'Graphing Income and Wealth'
p.xaxis.axis_label = 'Income'
p.yaxis.axis_label = 'Networth'

colors = brewer['OrRd'][9]
colors = colors[::-1]
mapper = LinearColorMapper(
    palette=colors, low=df.AGE.min(), high=df.AGE.max())
    
p.circle(x='INCOME', y='NETWORTH', source=source, fill_color={
        'field': 'AGE',
        'transform': mapper
    },
    line_color=None, size='WGT')

color_bar = ColorBar(
    color_mapper=mapper,
    major_label_text_font_size="10pt",
    ticker=BasicTicker(desired_num_ticks=len(colors)),
    formatter=PrintfTickFormatter(),
    label_standoff=6,
    border_line_color=None,
    location=(0, 0))

p.add_layout(color_bar, 'right')

hover = HoverTool()
hover.tooltips=[
    ('Age', '@AGE'),
    ('Income', '@INCOME{int}'),
    ('Networth', '@NETWORTH'),
]
p.add_tools(hover)

p.xaxis[0].formatter = NumeralTickFormatter(format="$0,0")
p.yaxis[0].formatter = NumeralTickFormatter(format="$0,0")
p.yaxis.minor_tick_line_color = None
p.xaxis.minor_tick_line_color = None

show(p)

In [31]:
df['INCOME'].corr(df['NETWORTH'])

0.5742691094851456

The x-axis shows annual household income, and the y-axis shows net worth. Both scales are logarithmic, so each major tick is 10 times the previous one. The data show some correlation, with a simple correlation coefficient of 0.57. Other studies have shown that the statistical relationship, as measured by R-squared, is around 0.33, meaning that income explains only about a third of the variation in wealth. The color of each marker represents the age of the head of the household, with darker dots representing households headed by older people. The size of each marker reflects the number of households that the sampled household is meant to represent (scaled to fit the graph), so even though there appears to be a large number of dots in the upper right-hand corner, they carry less weight than those in the lower left.

Although the relationship isn't especially strong, it is positive, so people with higher incomes do tend to have somewhat higher overall wealth.
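
The two figures quoted above are consistent with each other: for a one-variable linear fit, R-squared equals the squared correlation coefficient, and 0.574 squared is roughly 0.33. The coefficient computed earlier also treats every sampled household equally, whereas the marker sizes in the chart reflect survey weights. Below is a minimal sketch of a weighted Pearson correlation, assuming the `WGT` column holds those weights. The helper name `weighted_corr` is hypothetical (it is not a pandas function), and the toy numbers are made up purely for illustration.

```python
import numpy as np

def weighted_corr(x, y, w):
    """Pearson correlation of x and y using observation weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    mean_x = np.average(x, weights=w)
    mean_y = np.average(y, weights=w)
    # Weighted covariance and variances around the weighted means
    cov_xy = np.average((x - mean_x) * (y - mean_y), weights=w)
    var_x = np.average((x - mean_x) ** 2, weights=w)
    var_y = np.average((y - mean_y) ** 2, weights=w)
    return cov_xy / np.sqrt(var_x * var_y)

# Toy data (made up): with equal weights the result matches np.corrcoef
income = [20_000, 45_000, 60_000, 150_000]
net_worth = [5_000, 80_000, 40_000, 900_000]
equal = weighted_corr(income, net_worth, [1, 1, 1, 1])
unweighted = np.corrcoef(income, net_worth)[0, 1]

r = 0.5742691094851456  # unweighted coefficient reported above
r_squared = r ** 2      # share of variance explained by a one-variable linear fit
```

With the survey data this could be called as `weighted_corr(df['INCOME'], df['NETWORTH'], df['WGT'])`. The weighted result is not quoted here because it was not computed for this post.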

### Is This an Indicator of Inequality?

The association between income and wealth matters for the discussion of financial well-being, security and inequality. The relationship is an indicator of whether a household is able to turn income into savings rather than spending it on necessities.

While the visualizations in this post come nowhere near telling the full picture, there should be no debate about rising inequality worldwide, whether in wealth or income. Piketty's r > g inequality comes to mind here, where r is the rate of return on capital and g is the rate of overall economic growth. While debate continues over Piketty's calculations, his work has nonetheless been pivotal in the study of inequality. If r is indeed greater than g, then inequality will only increase, as people with more accumulated wealth continue to earn more on their assets, while the majority, who rely on labor income, fall further behind.
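
To make the r > g intuition concrete, here is a toy compounding sketch. It assumes all capital returns are reinvested and that labor income grows at exactly g. The rates of 5% and 2% are illustrative placeholders, not estimates from Piketty or from the SCF data, and the function name is hypothetical.

```python
def capital_to_income_ratio(r, g, years, start_ratio=1.0):
    """Ratio of capital to labor income after `years`, assuming capital
    compounds at rate r (returns fully reinvested) and income grows at rate g."""
    return start_ratio * ((1 + r) / (1 + g)) ** years

# Illustrative rates only (assumptions, not estimates)
for years in (0, 10, 30, 50):
    ratio = capital_to_income_ratio(r=0.05, g=0.02, years=years)
    print(f"after {years:2d} years: capital is {ratio:.2f}x labor income")
```

Under these assumptions the ratio keeps growing whenever r exceeds g, stays constant when the two are equal, and shrinks when g is larger.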

Important policy decisions will be needed to tackle these issues. A first step should be to measure the factors that determine a household's ability to turn income into wealth. Other possible measures include:

- Increasing the minimum wage.
- Higher taxes on wealth.
- Higher marginal tax rates at the top income levels.
- Investment in education.
- Wealth creation for low-income households (some variation on Cory Booker's [Baby Bonds](https://www.vox.com/future-perfect/2019/1/21/18185536/cory-booker-news-today-2020-presidential-election-baby-bonds) springs to mind).

...and so much more!

In an extended analysis, one could take a deep dive into how this correlation varies across occupations, age groups, regions, and income and wealth groups. For now, let's briefly visualize the data for various age groups.

### The Age Factor

It turns out that income is a better predictor of wealth for some age groups than for others. It is weakest as an explanatory variable in the youngest age range (correlation coefficient of 0.13), which makes sense, as people in their twenties are just starting out in their careers or are working part time while in college. We can see some heavy outliers at the top right, which one could reasonably assume are people who inherited wealth. The correlation is also low for retirees (0.22). Income predicts wealth much better for those in the middle age ranges (~0.57).
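
Per-group coefficients like the ones quoted above can be computed with a small helper such as the one below. This is a sketch, not necessarily the code used for this post: the function name `corr_by_age_band` and its parameters are assumptions, and the bounds are inclusive on both ends to mirror the age filters in the next cell.

```python
import pandas as pd

def corr_by_age_band(df, bands, x="INCOME", y="NETWORTH", age="AGE"):
    """Return the Pearson correlation of x and y within each (low, high) age band."""
    results = {}
    for low, high in bands:
        subset = df[df[age].between(low, high)]  # inclusive on both ends by default
        results[f"{low}-{high}"] = subset[x].corr(subset[y])
    return pd.Series(results, name="correlation")

# Hypothetical usage with the SCF frame loaded earlier:
# corr_by_age_band(df, [(20, 30), (30, 40), (40, 50), (70, 80)])
```

Returning a `Series` keyed by band makes it easy to compare groups side by side or plot them as a bar chart.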

In [16]:
# Bounds are inclusive on both ends (the pandas default), so ages 30 and 40 fall into two groups.
df1 = df[df['AGE'].between(20, 30)]
df2 = df[df['AGE'].between(30, 40)]
df3 = df[df['AGE'].between(40, 50)]
df4 = df[df['AGE'].between(70, 80)]

In [17]:
def plot_age(df, title):
    source = ColumnDataSource(df)

    p = figure(y_axis_type="log",x_axis_type="log")

    p.title.text = title
    p.xaxis.axis_label = 'Income'
    p.yaxis.axis_label = 'Networth'

    p.circle(x='INCOME', y='NETWORTH', source=source, fill_color={
            'field': 'AGE',
            'transform': mapper
        },
        line_color=None, size='WGT')

    hover = HoverTool()
    hover.tooltips=[
        ('Age', '@AGE'),
        ('Income', '@INCOME{int}'),
        ('Networth', '@NETWORTH'),
    ]
    p.add_tools(hover)

    p.xaxis[0].formatter = NumeralTickFormatter(format="$0")
    p.yaxis[0].formatter = NumeralTickFormatter(format="$0")
    p.yaxis.minor_tick_line_color = None
    p.xaxis.minor_tick_line_color = None

    show(p)

plot_age(df1, 'Age Group: 20s')
plot_age(df2, 'Age Group: 30s')
plot_age(df3, 'Age Group: 40s')
plot_age(df4, 'Age Group: 70s')
